Brave new words


Pierre J. Bancel & Alain Matthey de l’Etang

Association d’Etudes linguistiques et anthropologiques préhistoriques, Paris*


Contrary to the received idea that globally spread papa/mama words are 
constantly reinvented by children in different languages, we show here 
that these words are always inherited from the most ancient stages of their 
respective families, with the exception of a number of borrowings - which are 
not innovations, either. We then show that probabilistic calculations aiming to 
demonstrate that global and other remote etymologies might be mere chance 
resemblances are invalid, and that chance cannot be reasonably invoked in the 
cases these calculations deal with. Consequently, the global convergence of 
papa/mama words can only be a trace of a common heritage of all human 
languages. Finally, we link this finding with others, indicating that these words 
must have appeared early, most probably at the very origin of articulate language. 


1. The Proto-Sapiens kinship terms papa, mama and kaka 

Our central claim is that most modern papa/mama words, so widespread in all language
families worldwide, may be traced back to a common origin. We use the name
Proto-Sapiens for the original ancestor language from which they have been inherited, 
and which must have been the ancestral language of all known languages spoken by 
modern human beings, who together constitute the species Homo sapiens. 

On archeological and genetic grounds, Proto-Sapiens may be dated between 
200,000 years ago (the approximate earliest date at which our species emerged in 
Africa; McDougall et al. 2005; White et al. 2003) and 50,000 years ago (the approximate
latest date at which our direct ancestors may have left the African continent and
begun to spread their bones, genes, artifacts, and language over the rest of the world).
However, recent archeological findings from several South African sites - Klasies 
River Mouth (Deacon 2001; Singer & Wymer 1982), Blombos Cave (d’Errico et al. 
2005; Henshilwood & d’Errico 2005; Henshilwood et al. 2001) and Pinnacle Point
(Marean et al. 2007) - have revolutionized the dating of modern Sapiens anatomy and
behavior. Until recently, modern behavior was widely believed to have appeared no
earlier than some 40,000 years before present (yBP). But all these sites have revealed
numerous unambiguous traces of modern behavior (use of marine food, cooking food
on hearths, microlithic tools, polished bone tools, personal ornaments, geometrical
engravings, etc.) older than 80,000 yBP, and up to 164,000 yBP at Pinnacle Point.
Genetics-based datings like those of the “mitochondrial Eve” around 200,000 yBP
(Cann et al. 1987), or the split of Khoisan people between 70,000 and 90,000 yBP (Knight
et al. 2003), as well as the antiquity of the first human occupation of Australia and New
Guinea (at least 46,000 yBP, and perhaps 60,000 yBP for Australia; see Bowler et al.
2003; O’Connell & Allen 2004) also tend to push Proto-Sapiens back to a date earlier
than 50,000 yBP, perhaps as far back as some 100,000 yBP.

* Mail should be sent to first author: pierrejbancel@hotmail.com. Thanks to Peter MacNeilage,
Claire Lefebvre, and John Bengtson for their useful remarks and corrections, and to Shahar
Fineberg and Zofia Laubitz for their fine proofreading job. Errors, of course, are solely ours.

1.1 Historical background 

The global distribution of papa/mama words, noted as early as the mid-nineteenth 
century (Buschmann 1852), received its currently accepted explanation in the late 
1950s. Murdock (1959) and Jakobson (1960) - probably drawing on Lubbock (1889) 
or Westermarck (1891), though they do not quote any predecessor in this regard - 
explained that modern papa/mama words must be recent and had resulted from 
constrained, convergent innovations due to child/parent interaction in unrelated 
languages. In particular, Jakobson claimed that mama words derived from the nasal 
murmur mmm... mmm... of suckling babies; he left papa words unexplained - but 
may have considered, from a far-fetched structuralist perspective, that the non-nasal 
counterpart p of nasal consonant m should naturally apply to the non-breastfeeding 
counterpart of the mother, namely the father. 

This theory was not supported by any historical evidence. Its authors relied on the 
growing body of observations of child language acquisition to build an indirect explanation,
along the lines of “kinship appellatives resemble each other much too much
to have arisen by chance. Since conventional wisdom has it that the many language 
families they appear in are unrelated to each other, here is how they might have been 
spontaneously invented in various languages, even though this process has never been 
observed.” In spite of its indirectness and the good bit of wishful thinking it relied on, 
this theory immediately received wide approbation and is still taught in linguistics 
departments as the obvious explanation of the global distribution of papa/mama words. 

Murdock and Jakobson’s view was first challenged 35 years later by the American
linguist Merritt Ruhlen (1994a). He had discovered a new widespread appellative kaka
‘brother, uncle’, which had escaped the attention of comparatists for a century and a
half after the global distribution of papa/mama words had become known to linguists.
Its phonetic form was unlikely to have emerged from the babbling of babies, since
velars like k are acquired later than labials (p, b, m) and dentals (t, d, n). And it seemed
unlikely that its meaning had emerged independently with the same phonetic form in
many different language families. Ruhlen concluded that the many kaka words he had
discovered in a range of language families from Eurasia, the Americas and Oceania
had to have been inherited from a common ancestral Proto-Sapiens language. He also
suggested that there had to be an inherited component behind the global distribution
of papa/mama words as well, and that Jakobson’s explanation of their origin by convergence
was probably “exaggerated, if not completely mistaken” (p. 124).

Ruhlen’s discovery prompted us to undertake a global etymological comparison of
kinship appellatives. We first checked the etymological support of Proto-Sapiens kaka,
and found literally hundreds of kaka words in parts of the world where they had not
been documented by Ruhlen, notably Africa (in the Niger-Congo, Nilo-Saharan, and
Khoisan families), Australia (in most subgroups of the Australian family), and New
Guinea (in many branches of the Indo-Pacific family), as well as in many more language
families from other continents, such as Afroasiatic, Turkic, Mongolic, Tungusic,
Uralo-Yukaghiric, Japonic, Burushaski, Sino-Tibetan, Yeniseian, Dravidian, Eskimo,
and Na-Dene, as well as probably, under a phonetically decayed form, Indo-European
(Bancel & Matthey de l’Etang 2002; see Map 1).


[Map 1: world map showing sample kaka forms grouped by phylum: Dene-Caucasian,
Eurasiatic (AKA), Afroasiatic, Austric (KAKA), Nilo-Saharan, Dravidian, Amerind
(KAKA, KOKO), Niger-Congo, Khoisan (KOKO), Indo-Pacific (KAKA), and Australian (KAKA).]

Map 1. The global etymology kaka ‘mother’s brother, spouse’s father, grandfather, elder
brother’ (sample data)

Languages are grouped in phyla, themselves arranged in columns according to their approximate
respective location on the planisphere. Phylum names (e.g. DENE-CAUCASIAN) appear in capitals above
or below each column, followed by the most likely original form and the kinship positions it most likely
referred to; in each row, the language name (e.g. Zuni) is followed by the vernacular word in italics
(e.g. kaka), then by the abbreviated main meaning of the word (Fa ‘father’, Mo ‘mother’, Br ‘brother’,
Zi ‘sister’, Sib ‘sibling’, So ‘son’, Sp ‘spouse’, Gd ‘grand’, Pt ‘parent’, Ch ‘child’, e ‘elder’, y ‘younger’, MoBr
‘mother’s brother’, etc.); in Proto-Austronesian and Austronesian languages, Sib+ glosses words referring
to an elder sibling of opposite sex to ego (elder brother of a female, elder sister of a male).
On the basis of the data from some 700 languages we had first investigated, we also
determined that the focal etymological meaning of kaka was ‘mother’s brother’ rather
than ‘uncle’, followed by the less widespread meanings ‘grandfather’ and ‘brother’
(Matthey de l’Etang & Bancel 2002). We also suggested that kinship appellatives might
indeed be much older than Proto-Sapiens, and that their simple phonetic form and
specific use as calls by babies might have played a crucial role in the emergence of
articulate language (Bancel & Matthey de l’Etang 2002).

Further work, relying on a growing database of kinship terminologies (now comprising
over 2,200 languages), led us to develop our theories about both the Proto-Sapiens
origin of kinship appellatives (Bancel et al. 2010; Matthey de l’Etang & Bancel
2005, 2008, in preparation; Matthey de l’Etang et al. 2010) and their role in the emergence
of articulate language (Bancel & Matthey de l’Etang 2005, 2008, 2010).

1.2 Trask and the historical emergence of papa/mama words 

Both Ruhlen’s and our theses, however, were soon opposed by a comparative linguist,
the late Larry Trask.¹ To defend Murdock’s and Jakobson’s theory of the multiple, spontaneously
convergent origins of papa/mama words, Trask (2004) reviewed the history
of these words in various languages, and concluded in favor of their “endless re-creation
and recycling” (p. 15). It was the first time, in over a century and a half, that an attempt
was made to substantiate the traditional theory from a historical viewpoint. Indeed,
Trask’s work was useful in forcing us to descend from a global, essentially statistical,
viewpoint to the level of individual languages and families, in order to show that these
words, contrary to Trask’s claims, are not innovations in any particular language, but
have been preserved throughout the histories of their respective families. Expanding on
a previous answer (Matthey de l’Etang & Bancel 2008), our main task hereafter will be
to show, with a wealth of comparative data, that Trask’s study is flawed by fundamental
fallacies, that none of his examples is an innovation, and that all of them are, instead,
words that have been preserved over millennia with little or no change.

1.2.1 Inherited papa/mama words in Indo-European languages

By a radical misinterpretation, Trask confuses papa/mama with father/mother words.
All his examples of lost or decayed papa/mama words are in fact father/mother words,
normal words of the standard adult lexicon, used to refer to any parent rather than to
address one’s own. Let us begin with the Indo-European family,² from which he draws
numerous examples of “innovated” papa/mama words.

1. Trask did not quote our work or Ruhlen’s, but there is little doubt that his study was intended
as an answer to it, as it was published two years after our first papers had appeared
(Bancel & Matthey de l’Etang 2002; Matthey de l’Etang & Bancel 2002) in the comparative
linguistics journal Mother Tongue, of which Trask was an assiduous reader and contributor,
always to defend the traditional view that no trace of common linguistic inheritance older
than a few millennia should be taken seriously.

The Proto-Indo-European (PIE) words *pater ‘father’ and *mater ‘mother’ are, as 
Trask (2004, p. 12) himself says, 

mama/papa words which have acquired a suffix -ter .... Already these words were 
being treated like other words in the language. Since PIE, the original words for 
‘mother’ and ‘father’, where they have survived at all, have undergone the usual 
changes in pronunciation in the languages possessing them, 

like Swedish far and mor, French père and mère, or Irish athair (phonetically [ahir])
and máthair [ma:hir]. So Trask concludes:

It is scarcely likely that anyone would recognize [ahir] as a mama/papa word,
but in origin it definitely is. The mama/papa words are in no way resistant to the
process of linguistic change, including regular changes in pronunciation. Nor are
they resistant to loss. (Trask 2004, p. 12)

There is not the least doubt that PIE *pater and *mater, evidently derived from preexisting
*pa(pa) and *ma(ma), “in origin definitely” were papa/mama words.

But in origin only. Already in PIE, *pater and *mater were no longer papa/mama 
words - simple reduplicative words mimicking the babbling of babies and used to 
address one’s own parents. Instead, they had become father/mother words - ordinary 
words of the PIE lexicon, used to refer to anyone’s parents, as are all their derivatives 
in modern languages: English father and mother, German Vater and Mutter, Swedish
far and mor, Icelandic faðir and móðir, French père and mère, Spanish and Italian
padre and madre, Occitan paire and maire, Irish athair and máthair, Greek pateras and
mitera, Armenian hayr and mayr, Persian padar and madar, Ossetic fyd and mad, and 
hundreds of others. Word replacement and phonetic change had to - and obviously 
did - apply normally to these normal words of the adult lexicon. 

And *pater and *mater certainly were not the first words of PIE-speaking children
some 7,000 years ago, any more than father and mother are the first words of English
children today, or père and mère those of French children.


2. The Indo-European language family, whose discovery at the end of the 18th century and
further exploration in the 19th gave birth to linguistic science, comprises most groups of
languages spoken in Europe today (Celtic, Italic, Germanic, Baltic, Slavic, Albanian, Hellenic,
Armenian), as well as the huge Indo-Iranian group, itself divided into three subgroups (Indic,
Nuristani, and Iranian); it also includes two extinct groups, Anatolian and Tocharian. The
reader unfamiliar with language classification will find members of each group listed in Appendices
A to C and G, with examples of common words.
Moreover, Trask does not document a single papa/mama word known to be lacking
in a given stage of a language’s history, which appeared in a subsequent stage. He
merely assumes that, in every language where papa/mama and father/mother words
coexist, the former must be more recent than the latter. And in doing so, he often goes
against their known etymology.

As we will see in detail below, all of his “new” papa/mama words have been inherited
from the most ancient stages of their respective language families. When PIE
*pater and *mater were derived as reference terms from Pre-PIE *pa(pa) and *ma(ma),
the more ancient forms did not disappear. *Pa(pa) and *ma(ma) must have been kept
in parallel use as appellatives, just as in English father coexists with dad, and mother
with mom. The reader is referred to Appendix A, which displays the etymological
series supporting PIE *ma(ma) ‘mother, mom’ in the Tower of Babel³ Indo-European
database (Nikolayev 2007), completed by other data. From Prakrit māmikā ‘mother’,
Classical Greek mā (gā) ‘(Earth) Mother’ and Latin mamma ‘mommy’ to Punjabi mā
~ māu ~ māi ~ māmmi ‘mother’, Persian mām ‘mom’, Armenian mam ‘grandmother’,
Modern Greek mama ‘mom’, Ukrainian mama ‘mom’, Latvian māma ‘mom’, Faeroese
mamma ‘mom’, Sutsilvan Rumantsch moma ‘mom’, French maman ‘mom’, mamie ~
mémé ‘granny’, Breton mam ‘mother’, or Gheg Albanian mame ‘mother’, more than a
hundred languages from the vast majority of IE subgroups unambiguously establish
the PIE antiquity of this word.

Nikolayev (2007) does not posit a PIE root *pa or *papa. Its existence, however,
cannot be doubted given the comparative data of Appendix B, which provides some
170 papa words from well over a hundred IE languages, from Palaic pāpa ‘father’,
Prakrit bappa ‘father’, Khwarezmian papa ‘father’, Classical Greek pappa ‘dad’, pappous
‘grandfather’ or Latin pappa ‘dad’ and pappus ‘grandfather’, to Marathi bāp ‘father’,
Kamv’iri vov ‘grandfather’, Farsi baba ‘father, grandfather’, Armenian pap ‘granddad’,
Modern Pontic Greek papa ‘dad’, Latvian paps ‘dad’, Danish papa ‘dad’, or Occitan papa
‘dad’ and papet ‘grandfather, granddad’.

The Proto-Indo-European descent of these words eliminates many of Trask’s
“innovations”; Greek mama, Icelandic mamma and pabbi, Norwegian mamma and
pappa, French maman and papa, Italian mamma and babbo, Polish mama, Bengali ma
and baba, Hindi baba or bap, Persian mām and baba, Latvian mama and paps (Trask
2004, pp. 13-14) all directly derive from PIE appellatives *mama and *papa, which
themselves must be even older than PIE, since in PIE times their derivatives *pater
‘father’ and *mater ‘mother’ were already well established.

3. The Tower of Babel Project (http://starling.rinet.ru/) brings together the Russian State
University of the Humanities (Moscow, Russia), the Moscow Jewish University (Russia), the
Russian Academy of Sciences, the Santa Fe Institute (New Mexico), the City University of
Hong Kong (China), and the Leiden University (The Netherlands). It provides free access to
etymological databases for numerous language families, compiled by some of the best specialists
worldwide. In our etymological lists, unreferenced data not drawn from Nikolayev (2007)
may be found in easily accessible standard dictionaries.

Two double examples, jointly presented by Trask to illustrate the converging process
of innovation in kinship appellatives, are worth special consideration:

The ancestral PIE words [*mater and *pater] have been completely lost in a 
number of the daughter languages, lost and replaced by other words. Two of 
those languages are Romanian and Welsh [...]: 

            ‘mother’    ‘father’
Romanian    mama        tata
Welsh       mam         tad

But look at the words which have replaced the lost older ones! The newer words
which have replaced the older ones are themselves mama/papa words. According
to the Proto-World account, ... [t]he mama/papa words are supposed to be no
more than ancient survivals, and they can’t do anything except survive for a while
longer or disappear. They absolutely can’t reappear in languages which have lost
them. But they do. And they do it all the time. (Trask 2004, p. 12)

Reappear all the time? Romanian mama certainly did not (re)appear out of the blue, 
nor did Welsh mam. The data in Appendix A establish that they were inherited from 
Latin mamma and Proto-Celtic *mama, respectively, and that both ultimately derive 
from PIE *mama. They are exactly the “ancient survivals” Trask does not want to see in 
them. And this has long been known to etymologists (Romanian: Meyer-Lübke 1911;
Academia Romana 1998; Welsh: Charles-Edwards 1993, p. 169). 

But what about Romanian tata ‘father, dad’, and Welsh tad ‘father’? Could they 
be “newer words which have replaced the older ones”? They could not. Romanian 
tata has been known for a century to derive from Latin tata ‘dad’ (Meyer-Lübke 1911;
Cioranescu 1958-66; Academia Romana 1998). According to Charles-Edwards (1993,
p. 169), Welsh tad goes “back at least to the Romano-British period” (43 CE to early
5th century), as it is found in all the ancient stages of the Brythonic group of Celtic
(Old Cornish, Middle Welsh, and Middle Breton). And the Old Irish word data ‘foster
father’ (Old Irish belongs to the Goidelic group) shows that the word must be of
Proto-Celtic origin.

But their antiquity in their respective language groups - Romance and Celtic - is
not the end of their story. Both words belong to the PIE etymology *tata ‘dad, father’
reported in Appendix C, again based on Nikolayev (2007) and completed with data
from various sources. Once more, from Hieroglyphic Luwian tati(a)- ‘father’, Vedic
Sanskrit tata ‘father’, Old Avestan tā ‘father’, Classical Greek tata ‘daddy’, or Latin tata
‘dad’ to Kamv’iri tot ‘father’, Roshani taat ‘father’, Czech tata ‘father, dad’, Latvian tete
‘dad’, Romanian tata ‘father, dad’, Breton tad ‘father’, or Albanian tate ‘father’, both
ancient and modern data from most subgroups abundantly testify to its inheritance
from the earliest PIE stages.

To sum up, Trask’s claim that Romanian tata and Welsh tad are words that have
recently (re)appeared is, again, contrary to obvious etymological and comparative facts.

1.2.2 Inherited papa/mama words in Dravidian and Turkic languages

Let us also consider two non-Indo-European examples cited by Trask, in Tamil and
Turkish, respectively. In Trask’s (2004, p. 14) view, the “informal” Tamil word appaa
‘dad’ is newer than the “formal” takappan ‘father’. But it simply cannot be. The honorific
takappan is a compound formed from tak, an adjectival form of the verb taku ‘to be
excellent’, and appan ‘father’ (Emeneau 1953, p. 342, 10). And Tamil appan ‘father’ is
itself a suffixed derivative of appaa ‘dad’, just as PIE *pater was a suffixed form of *pa-,
as revealed by the comparative data in Appendix D, drawn from the classical etymological
dictionary of Dravidian.

The earliest trace of Tamil appaa is found in a 3rd-century CE inscription, used as 
a masculine honorific suffix (Mahadevan 2003, p. 609), as in Modern Kannada, Tulu, 
and Telugu. And appaa evidently derives from Proto-Dravidian. 

With regard to Turkic, as Trask himself says, the inherited word for ‘father’ is ata, 
and this word is “still the everyday word in most Turkic languages.” But, in Modern 
Turkish, 


the word ata [has become] an elevated word meaning ‘forefather, ancestor’, [and] 
the everyday word for ‘father’ is now baba. This, of course, is another mama/papa 
word, and it used to be the Turkish word for ‘daddy’, but now it is the ordinary 
word for ‘father’, and ‘daddy’ must now be expressed by adding a diminutive 
suffix, producing babacik. (Trask 2004, p. 13) 

To Trask, this succession of a meaning shift, a replacement, and a suffixation illustrates
the idea that nursery words change ceaselessly. Proto-Turkic *ata ‘father’ is indeed
reflected in many ancient and modern Turkic languages, from Old Uighur ata to
Sary-Yughur ata through Tuvin ada, Azeri ata, and Khakassian ada, all meaning ‘father’
(Appendix E1). After 1,300 years, most of these terms remain identical to Proto-Turkic.
In Turkic languages, preservation of *ata has been the rule. Furthermore, its
meaning shift to ‘forefather, ancestor’ in Modern Turkish is quite a common one. One’s
father is one’s closest male ancestor, and in nearly all languages words meaning ‘father’
may also refer to other male ascendants, or even brothers and male descendants. This
was the source of a vast majority of their semantic changes, which are merely expansions
or retractions of their scope within the narrow field of kinship relationships,
mostly within the same gender.

For its part, the diminutive babacigim ‘daddy’ (rather than babacik, which is not
found in a single Turkish dictionary) does not replace baba (found in all Turkish
dictionaries with the meaning ‘dad’) in Turkish children’s first words nor in their
parents’ baby talk, any more than English daddy replaces dad, Italian babbino replaces
babbo, or French papoune or papounet replaces papa. They are affectionate diminutives,
and may continue to coexist for centuries with their respective root words baba,
dad, babbo, and papa, or perhaps enter the standard language with a new meaning. But
baba, dad, babbo, and papa will remain, because babies need them to learn to speak,
and parents to teach children, as will be explained in Section 3 below.

Finally, as for Turkish baba itself, far from being new, it was borrowed from
Persian (Nişanyan 2001) after the Turks invaded the Persian Empire, a borrowing certainly
facilitated by the existence in Turkish of another old Turkic word, aba ‘father,
ancestor’, preserved in many Turkic languages (Appendix E2).

Borrowed? Yes, papa/mama words may be borrowed, and indeed they are probably
more frequently borrowed than any other words in the basic lexicon. We have
already met Greek baba, borrowed from Turkish - which had previously borrowed
it from Persian. Albanian baba was also borrowed from Turkish during the Ottoman
domination over the Balkans. English dad, an isolated form in the Germanic group of
Indo-European, whose other members all have papa forms (Appendix B), was likely
borrowed from Brythonic Celtic, where tad ~ tat forms are general (Appendix C),
when the Anglo-Saxons invaded Great Britain. It is also likely that Romanian tata
‘father, dad’, a descendant of Latin tata ‘dad’, which even replaced in Romanian the
outcome of Latin pater ‘father’, was helped to survive - and thrive - by the forms tata
‘father, dad’, which are general in the surrounding Slavic languages from which Romanian
borrowed thousands of other words, while many other Romance languages lost
Latin tata and preserved pappa instead.

But borrowing is not an innovation, in the sense of a newly created word.
A borrowed word has a history in the donor language, and the receiver language continues
this history. In the case of Turkish baba, as we have seen, its Persian source
derives from Proto-Indo-Iranian *baba, itself from Pre-Proto-Indo-European *papa 
(Appendix B). Brave new word. 

1.2.3 Inherited papa/mama words in Chinese languages

The case of Chinese, not studied by Trask, also deserves consideration. In nearly all 
modern Chinese languages, from Mandarin to Cantonese, address terms used for one’s 
father and mother are pa ‘dad’ and ma ‘mom’, respectively (see Appendices F1 and F3).
Only their tonal contours vary according to dialect. Both pa and ma have reduplicated 
variants, respectively papa and mama, felt to be more childish by speakers (Agnes 
Gaudu, personal communication). 

In the Chinese Characters (Starostin 2006) and Modern Chinese Dialects (Wang 
2004) Tower of Babel databases, modern pa ‘dad’ forms are assigned an etymology 
dating back to Preclassic Old Chinese pa? ‘father’, implying very little variation over 
some 3,500 years (Appendix F1). The ancient forms are apparently shared with the etymology
of the referential word ‘father’ (Appendix F2), derived from Middle Chinese
pü, a dialectal form attested since the Tang period (seventh to tenth centuries CE), which
evolved into Beijing, Jinan, or Xi’an fu^, Shuangfeng Chaozhou Fuzhou 

Shanghai vw^^, etc. (superscript numbers indicate tones). 

In the phonetic form pa? ‘father’ of Preclassic Chinese reconstructed by Starostin 
(2006), the final glottal stop -? is essentially posited to explain the tonal evolutions in 
modern dialects, inspired from regular correspondences in other words. But in words 
like ‘dad’ and ‘mom’, the evolution of whose tonal pattern is highly likely to have been 
influenced by expressive intonational patterns, this particular final -? does not need
to ever have existed. And it surely did not, given that it is not present in even a single 
modern Chinese dialect. 

Indeed, what happened in Chinese seems clear. From a Preclassic pa, a form
pwd ~ pw6 ‘father, dad’ appeared during the Han period and progressively specialized
as a referential term, giving rise to Middle Chinese pü ‘father’, from which
all the modern forms fu ‘father’ derive. Meanwhile, pa continued to be used as an
address term in the spoken language and was transmitted without any change in all
Chinese dialects. However, the pictogram that originally read pa received the phonetic
reading of the reference term, showing that pa was originally used for both
address and reference.

The identical (except for tones) pa forms of all modern dialects - known not only 
by ideograms, but by phonetic descriptions as well - prove that pa survived unchanged 
throughout the history of Chinese. And the Eastern Han and Postclassic pwd or pwd 
forms have been misattributed in Starostin’s database - they may not be the phonetic 
ancestors of modern pa ‘dad’ forms but are forerunners in the evolution of modern fu
‘father’ forms. 

A similar situation appears in the etymology of terms meaning ‘mom’ and ‘mother’,
although the two terms may have begun to differentiate already in Preclassic Chinese
(see Appendices F3 and F4). In ancient forms, again, the aspirated initial mʰ- and the
final glottal stop -? are reconstructed on the basis of tonal developments in modern
dialects. Thus, as in the case of pa ‘dad’, both are far from assured and indeed are superfluous,
given their absence from ma words in all modern dialects and the expressive
uses of appellatives.

Just as Proto-Indo-European speakers created *pater ‘father’ and *mater ‘mother’
from preexisting *pa(pa) and *ma(ma) words and continued to use them in parallel,
Chinese speakers developed new reference (father/mother) words from the Old
Chinese words pa and ma that were initially used for both reference and address. But
speakers continued to use in parallel the original pa/ma forms as address terms. The
new reference terms have strongly evolved in modern Chinese dialects, e.g. Wenzhou
voy^^ or Shuangfeng ‘father’, or Chaozhou ‘mother’ - in which a non-nasal
consonant has even appeared - but in all dialects the address terms pa ‘dad’ and ma
‘mom’ have remained exactly the same as those used over 3,000 years ago.


1.3 Summary 

Papa/mama words are exempt from most phonetic evolutions, but may, on occasion,
vary phonetically within the limits allowed by babbling as regards vowel
quality and length as well as consonant gemination and the voiced/voiceless (or
fortis/lenis) contrast, for example within the Germanic group: German Papa,
Rhine Franconian Pappe ~ Babbe, Bavarian Babba, Faeroese pápi, Icelandic pabbi
‘dad’.

Due to their once common use to address elders respectfully or youngsters affectionately,
they may vary semantically, in general within the same gender, e.g.
Sogdian bdbay ‘father’, Yaghnobi (a modern descendant of Sogdian) bobo ‘grandfather’,
but may also occasionally be recruited into morphological alternations,
e.g. Bashkarik mem ‘mother’s mother’, mam ‘mother’s father’, locally introducing
etymological confusion.

They may give rise to father/mother words and continue to coexist with them,
for example, Pre-PIE *papa/*mama having given rise to PIE *papa/*mama and
*pater/*mater, or Old Chinese pa/ma having evolved into Mandarin pa/ma and
fu/mu.

Papa/mama words may be borrowed, such as Modern Greek baba ‘dad’, borrowed
from Turkish. Such borrowings are most of the time facilitated by similar preexisting
words in the target language: Homeric pappa > Hellenistic papa, preserved
in Modern Pontic, differed from Turkish baba only in consonant voicing. In turn,
Old Turkish aba ‘father’ differed from Persian baba, borrowed into Turkish, only
in partial versus full reduplication.

Ancient languages possessed more papa/mama words than modern ones and
used them extensively as terms of respect for elders. Certainly, PIE *papa and
*tata were not exact synonyms; otherwise they would not have been preserved
in so many of the descendant languages, always with different meanings in languages
that preserve both. Their original semantic difference, which may have
resided in their connotations rather than the persons they referred to, remains
uncertain. Due to semantic overlap and the loss of importance of kinship relations -
which used to be the very essence of the social organization in all hunter-gatherer
societies, the only way of life of all human beings until some 10,000 years
ago - some of these words have been lost in historical times, as was Latin tata ‘dad’,
lost in French, Occitan, Spanish, and Portuguese.

A few such words, however, do randomly appear in the course of the history of
individual languages, such as French tata ‘auntie’, a diminutive of tante ‘aunt’. Such
cases do not stem from babies’ babbling but from adults’ baby talk. Very importantly,
they do not obey the distribution rule Oral stops for males, nasal stops for
females. This rule was already observed in 75% of languages by Murdock (1959)
in his survey of words meaning ‘father’ and ‘mother’ in 474 languages, and was
confirmed by our own statistics bearing on 1,184 languages (Table 1; for a detailed
analysis, see Bancel et al. 2010). And, given the massive preservation of original
forms in most languages from all families, innovations may only represent a tiny
minority of the countless papa/mama words worldwide.


Table 1. Most prominent meanings of papa, kaka, nana, and mama
(Occ. = number of occurrences; Percent: total > 100)

Meaning \ Form          PAPA                KAKA                NANA                MAMA
                     Occ. Percent        Occ. Percent        Occ. Percent        Occ. Percent

F                     288   59.2%          12    2.4%          38    6.2%          84   26.1%
F+FB                  106                   3                   4                  82
FB                    100   15.0%          59    9.2%           1    0.1%           7    1.1%
FZ                     36    5.4%          10    1.6%          48    7.1%          31    4.9%
M                      20    6.0%          14    3.0%         250   64.0%         232   43.1%
M+MZ                   20                   5                 182                  42
MZ                      8    1.2%          24    3.7%          48    7.1%          49    7.7%
MB                     33    5.0%         221   34.5%          11    1.6%         105   16.5%
B+                    100   15.0%         111   17.3%          28    4.1%           4    0.6%
X+                     27    4.1%          64   10.0%          59    8.7%           9    1.4%
Sib+                    7    1.1%          32    5.0%          17    2.5%           9    1.4%
GdF                   134   20.1%          86   13.4%          16    2.4%          15    2.4%
GdM                    45    6.8%          61    9.5%          48    7.1%          52    8.2%
GdPt                   15    2.3%          35    5.5%           4    0.6%          12    1.9%
GdPt+GdCh              42    6.3%          31    4.8%           4    0.6%          35    5.5%
GdCh                   38    5.7%          28    4.4%           0    0.0%           6    0.9%
Ch                     14    2.1%           3    0.5%          65    9.6%          35    5.5%

TOTAL out of        1,033 cognates      799 cognates        823 cognates        809 cognates
1,184 languages     in 666 languages    in 641 languages    in 675 languages    in 635 languages
                    (56% of sample)     (54% of sample)     (57% of sample)     (54% of sample)

Figures calculated for 1,184 languages; percentages have been calculated with regard to the number of 
languages attesting one or more words in the series concerned. Not all kinship relations attested for each 
term are listed above: for each series, at least a dozen other relations are attested by a few items. As of 
August 2013, our database comprised more than 2,400 kinship terminologies, and percentages would not 
be very different. (Table from Bancel et al. 2010) 

© 2013. John Benjamins Publishing Company
All rights reserved


Both Murdock’s semantic convergence rates and our own statistics have been calculated
for father/mother words and papa/mama words taken together. The reason is, as
we found in our own compilation of kinship terminologies, that while words meaning 
‘father’ and ‘mother’ nearly always figure even in the shortest wordlists noted by field 
linguists or anthropologists, the corresponding appellatives are seldom noted - probably
because of their perceived childish nature and their near identity in all languages,
which result in a kind of disdain towards them. Since father/mother words are much 
less stable than papa/mama words, there is no doubt that the proportion of papa/ 
mama appellative words complying with the oral/nasal distribution rule would be 
much higher than 75%, and probably above 90%. 

However, before concluding that papa/mama words must share a common origin, 
we have to address another possibility. 


2. Chance resemblances? 

The main argument opposed to etymologies linking languages at a greater remove 
than Indo-European or other relatively recent ancestor languages is that the comparative
series they rely on might have arisen by chance. To the best of our knowledge,
this argument has never been leveled at papa/mama words, and we might consider 
it discarded in advance by the wide acceptance of Murdock’s and Jakobson’s theory 
of their spontaneous convergence under the influence of babies’ babbling. If chance 
might have led to this convergence, putting forward or accepting any other explanation
would have been absurd.

Indeed, the true absurdity would be to consider that the massive global convergence
of papa/mama words could have arisen by chance. The overwhelming majority
of these words are traceable to the very origin of their respective language family, in 
which they have survived for millennia - in the case of Indo-European languages, 
for 6,000 to 8,000 years, according to the most likely estimations. How could they 
have spontaneously emerged in different families all over the world with convergent 
meanings and phonetic forms in a distant past, while they have been among the most 
conservative words in the last several millennia? In ancient languages as well, they had 
to have been inherited, even if some may have been borrowed in a minority of cases. 

The primordial role of kinship in the social organization of all peoples before
the appearance of agriculture - and undoubtedly for eons, as clear precursors of kinship
relationships are found in our closest ape cousins, chimpanzees and bonobos (de
Waal 1982; Fouts 1997) - excludes the possibility that they could be recent inventions.
Their global distribution definitely excludes generalized, intercontinental borrowings,
so that the only remaining explanation is that they have been transmitted over dozens
of millennia from a common ancestor language. They may only have stemmed
from a common, Proto-Sapiens origin - an idea which makes sense with regard to
both archeological and genetic data about the expansion of Homo sapiens from their
African homeland some 50,000 to 100,000 years ago.

We could thus dispense with a detailed refutation of the chance hypothesis. Nevertheless,
we will address it here in some detail, as we are convinced that deep-time
linguistic comparison has much more to tell us about the development of human language
as far back as the beginning of our species’ expansion, thus shedding light on
a crucial period in Homo sapiens’ history, with the dramatic acceleration of technical
evolution and the appearance of food cooking and of personal and graphic ornaments.
Many scholars think that these changes must be linked to an evolution of human language
ability, with the most frequently mentioned candidate being the emergence of
syntactic articulation. We have no doubt that the comparative-historical study of languages
can help to understand this evolution, and we will illustrate this opinion at the
end of this article (Section 5.1).

Through a detailed analysis of two tentative probabilistic refutations of deep-time 
etymologies, we will show that proving or disproving Proto-Sapiens etymological 
series by means of probabilities would demand calculations involving many parameters,
some of which are not easily amenable, if at all, to numerical representation.
It will also appear that the etymologies subjected to these treatments are beyond the 
point where a probabilistic assessment is necessary. Similarly, regular phonetic correspondences
in low-level linguistic families are far beyond the level where chance might
be involved and are with good reason regarded as indisputable proof of the common 
descent of the words they are found in, without having ever undergone any kind of 
mathematical assessment. 

2.1 Inaccurate calculations 

The probabilistic refutations of deep-time linguistic comparisons known to us fall into
two categories. The first one is that of historical linguists unfamiliar with the basic
principles of probabilities. For instance, the Indo-Europeanist Donald Ringe (2002),
trying to show that Greenberg’s (2000) Eurasiatic⁴ etymologies are due to chance
resemblances, overlooks the fact that a probability is a ratio - that is, it describes the
number of chances for a particular event to happen out of a total number of possible
events, so that one has 1 chance out of 6 of getting an ace when throwing an ordinary
die, but only 4 out of 52, or 1/13, when taking a card from a deck. This leads Ringe, in
six dense pages, to multiply probabilities as he adds parameters that obviously shrink
them - as if he had found that, when taking a card from each of four decks, there were
4 × 4 = 16 “absolute” chances of getting 4 aces, instead of (1/13)⁴ = 1/13⁴ = 1/28,561,
or 1 chance out of nearly 30,000. As a result, Ringe finds that Greenberg had “more
than 35 quintillion” chances of discovering a first-person pronoun root *m- common
to 21 language groups from northern Eurasia.⁵ Out of how many possible outcomes,
he does not mention, not realizing that 35 quintillion chances out of 3,500 quintillion
would yield a tiny probability of 1%, or 0.01, while if there were 35 octillion possible
outcomes, it would descend to a minuscule probability of 1 billionth, or 1/10⁹. No reliable
conclusions can be drawn from such fanciful calculations.


4. Eurasiatic is a macrofamily of languages discovered by Greenberg (2000-2001) encompassing
the Indo-European, Uralo-Yukaghir, Altaic, Koreo-Nippo-Ainu, Gilyak, Chukchi-
Kamchadal, and Eskimo-Aleut language families.
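
The arithmetic of this paragraph can be checked with a few lines of Python. This is only a sketch of the card analogy and of the two hypothetical totals mentioned above; the variable names are ours, and the totals of possible outcomes are the illustrative ones used in the text, not figures taken from Ringe (2002).

```python
from fractions import Fraction

# Probability of drawing an ace from one ordinary 52-card deck.
p_ace = Fraction(4, 52)                      # reduces to 1/13

# Probability of drawing an ace from each of four independent decks.
p_four_aces = p_ace ** 4                     # 1/13**4

# A count of favourable cases becomes a probability only once it is
# divided by the number of possible outcomes (short-scale number names).
favourable = 35 * 10**18                     # 35 quintillion
p_small_space = Fraction(favourable, 3_500 * 10**18)   # out of 3,500 quintillion
p_large_space = Fraction(favourable, 35 * 10**27)      # out of 35 octillion

print(p_ace, p_four_aces)
print(float(p_small_space), float(p_large_space))
```

The same count of "chances" thus yields 1 in 100 or 1 in a billion depending solely on the denominator, which is the point made above.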

2.2 Inaccurate comparative linguistics 

The second category of erratic probabilities is due to scholars unfamiliar with
comparative-historical linguistics performing apparently correct probabilistic calculations
on irrelevant parameters. This is what the phonetician Louis-Jean Boe does with
Bengtson and Ruhlen’s (1994) global - that is, Proto-Sapiens - etymologies, in a study
whose successive versions (Boe et al. 2003; Boe 2004; Boe et al. 2006) do not show any
real improvement in this regard.

2.2.1 Inaccuracy with regard to linguistic taxonomy 

Knowing the proportion of languages that reflect an assumed original root seems
important to ensure that the assumed cognate words are not random look-alikes: if
you take a card from each of 52 decks, you may be nearly sure of getting at least one
ace, and the greatest probability is that you will get four of them. How does this work
with languages? Boe et al. (2003) count the total number of languages mentioned by
Bengtson and Ruhlen (1994) in support of all their 27 Proto-Sapiens etymologies.
They find 1,317 of them, and, assuming that this was the total number of languages
investigated by Bengtson and Ruhlen, they relate to this total the average number of
languages cited in support of each etymology. They find that each etymological series
comprises an insufficient number of languages and families. But their count and its
alleged consequences are simply pointless. When introducing their Proto-Sapiens etymologies,
Bengtson and Ruhlen warn that the potential descendant words they quote
are only examples:

[S]ince the existence of these roots as characteristic features of the language
families cited has already been established by other scholars, and is not for the
most part in question, we do not give the complete documentation for each family,
limiting ourselves in most instances to an indication of the range of semantic
and phonological variation within the family. The reader who wishes to see every
relevant form for a given family should consult the sources cited.

(Bengtson & Ruhlen 1994, p. 291; emphasis added)


5. This first-person root m- is represented in English by me, my, mine, and as a relic of the
PIE conjugation system in I am. In an unpublished study, we have found that it survived as
the first-person pronoun root in 99.6% of 494 Indo-European languages and dialects, from
Icelandic mig to Assamese mok through Portuguese me, Greek me, Russian menja or Pashto
ma, whose common descent from the PIE root *m- is acknowledged by all Indo-Europeanists,
including Ringe himself. Only two IE languages, Tocharian A and B, may have lost it. This
stunning preservation, paralleled in most of the 20 other families alluded to by Ringe, from
Turkic to Eskimo through Finno-Ugrian and Chukchi-Koryak, shows that chance has nothing
to do with the presence of this pronominal root in 21 families, most of which also share a
second-person root t- (English thou, thee, thy, thine) as well as some 70 other grammatical
roots and hundreds of lexical roots (Greenberg 2000-2001).
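
The card analogy that opens this subsection (one card drawn from each of 52 decks) can be verified with a binomial model. The sketch below assumes independent draws with probability 1/13 of an ace per deck; the function and variable names are ours and serve only as an illustration.

```python
from math import comb

def binomial_pmf(n: int, k: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_decks = 52
p_ace = 4 / 52

# Probability of drawing at least one ace over the 52 decks.
p_at_least_one = 1 - (1 - p_ace) ** n_decks

# Most likely number of aces (the mode of the binomial distribution).
most_likely = max(range(n_decks + 1),
                  key=lambda k: binomial_pmf(n_decks, k, p_ace))

print(round(p_at_least_one, 3), most_likely)
```

Under these assumptions, at least one ace is drawn in more than 98% of cases, and four aces is the single most probable outcome, as stated in the text.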

Let us illustrate Boe et al.’s misinterpretation of Bengtson and Ruhlen’s data - not for
each of the 27 global etymologies, because that would take several books, nor even for
a single one, but for a single family supporting a single etymology. In support of their
Proto-Sapiens etymology tik ‘finger, one’, Bengtson and Ruhlen (1994, pp. 322-323)
give 184 reflexes from 165 languages (12.5% of the 1,317 languages they quote), including
a mere 9 reflexes taken from only 6 Indo-European languages:

Indo-European: Proto-Indo-European *deik- ‘to show, to point’, *dekm- ‘ten’;
Italic: Latin dig(-itus) ‘finger’, dic(-āre) ‘to say’, decem ‘ten’; Germanic: Proto-
Germanic *taihwō ‘toe’; Old English tahe ‘toe’; English toe; Old High German
zeha ‘toe, finger’. (Bengtson & Ruhlen 1994, p. 322)

Does this sample exhaust what Bengtson and Ruhlen could have found in the Indo-
European family? Well, not exactly. Appendix G1 displays the data mentioned by
Nikolayev (2007) under the PIE etymology *deike- ‘to show, to point’, completed by
Pokorny (1959), Lubotsky (no date), Turner (1962-1966) and standard dictionaries of
various modern languages. While it is still far from exhaustive, it offers 170 derivatives
of the Indo-European root *deike- ‘to point, to show’ in some 80 languages. As regards
PIE *dekm- ‘ten’, Appendix G2 lists 250 reflexes from 247 languages, drawn from the
same sources plus the remarkable compilation of Rosenfelder (no date). The common
descent of these words is assured by two centuries of Indo-Europeanist comparison
and, as Bengtson and Ruhlen say, is “not for the most part in question.”⁶

Thus, in the Indo-European family alone, over 400 possible reflexes of Proto- 
Sapiens tik add to the 9 examples given by Bengtson and Ruhlen. And Indo-European 
is but one of the 21 families displaying reflexes of Proto-Sapiens tik ‘finger, one’ in 
Bengtson and Ruhlen’s series. 


6. Only Classical Greek dak-tulos ‘finger’ and its direct Modern Greek descendant dak-tilo
‘finger’ are not recognized by Indo-Europeanists as related to the series (e.g. Chantraine 1968,
pp. 249-250) because of their irregularity; we nevertheless think they do belong to it.




Boe et al.’s claim, based on language counts, that Bengtson and Ruhlen’s etymologies
are insufficiently supported, and thus likely to have resulted from chance resemblances,
obviously falls far off the mark.

Now, should Bengtson and Ruhlen have published such huge lists for all families 
supporting each of their etymologies? From the viewpoint of reconstruction, no. The 
two PIE roots *deik’e- ‘to point’ and *dekm- ‘ten’ rely on regular phonetic correspondences
attested in innumerable other etymological series; hence their validity does
not depend primarily on the number of reflexes but on the regularity in the detail
of correspondences. No lists such as those in Appendix G have ever been published
by any Indo-Europeanist, and this essentially underscores the vacuity of probabilistic
calculations that do not take into account the fact that Proto-Indo-European is an
ancestor language. With regard to the earlier history of a particular word, PIE represents
all its descendant languages - those that preserved the word in question as well
as those that lost it. If a word existed in PIE, the fact that it disappeared from 4, 40, or 
400 descendant languages is irrelevant to the ancestry of this word before PIE, and Boe 
et al.’s method, beyond their misreading of Bengtson and Ruhlen’s warning about the 
incompleteness of their examples, entirely misses this crucial point. Yet Bengtson and 
Ruhlen are quite explicit once again: 

A common criticism is that, with around 5,000 languages to choose from, it 
cannot be too hard to find a word in some African language that is semantically 
and phonologically similar to, or even identical with, some word in an American 
Indian language. ... But this sort of mindless search is exactly the reverse of how 
the comparative method proceeds. The units we are comparing are language 
families, not individual languages.... So instead of drawing our etymologies from 
thousands of languages, we are, rather, limited to [32] families, some of which 
have no more than a few hundred identifiable cognates. The pool of possibilities 
is thus greatly reduced, and accidental look-alikes will be few. 

(Bengtson & Ruhlen 1994, pp. 279-281; emphasis in the original) 

The inequality of languages and proto-languages with regard to their early history also 
affects contemporaneous languages: for instance, a reflex found in a language such as 
Basque or Burushaski, which by themselves constitute long-isolated language families, 
cannot be given the same etymological weight as a reflex found in one of the several
hundred Romance or Germanic dialects. This evolutionary hierarchy is not easily
reduced to figures - in particular with regard to disputed taxa, as is often the case of
subgroupings within accepted families, and nearly always for remote macrofamilies
and phyla: should Basque be given the weight of a completely isolated language, as if
the Basques had independently discovered articulate language, or should it be considered
a member of the Vasco-Caucasian macrofamily, or of Dene-Caucasian, a hotly
disputed phylum whose huge range spans across northern Eurasia far into northwestern
North America?
Still, from a probabilistic viewpoint, the number of languages in which a word
from a proto-language did survive may not be entirely irrelevant to its earlier antiquity.
The two lists in Appendix G tell us that the two PIE roots *deike- ‘to point’ and *dekm-
‘ten’ are among the words that have best resisted loss in the history of IE languages. In
itself, this resistance shows that these words are able to survive over long periods of
time, which is a strong a priori argument in favor of their ability to have survived over
the times that preceded PIE as well. For this reason, Bengtson and Ruhlen might have
published the detailed support of at least one of their etymologies.

But, whatever the amount of sources and data, we do not see how the taxonomic 
ranking of languages (i.e. the inequality between an ancestor language and its descendants,
or between a long-isolated language and a dialect in a large family) could be
taken into account in a statistical calculation. The recent achievements of cladistics,
involving sophisticated probabilities, tend to show that it might perhaps be possible;
but it would demand a serious collaboration between qualified statisticians and comparative
linguists.

2.2.2 Inaccuracy with regard to phonetic correspondences

Boe et al.’s probabilistic assessment of the phonetic validity of Bengtson and Ruhlen’s 
series is inaccurate as well. They total the different phonetic forms assumed by Bengtson
and Ruhlen to descend from each root (Boe et al. 2003, p. 2707), and find it so large 
that, in their opinion, any correspondence would be allowed, and thus meaningless. 
The case of Proto-Sapiens tik ‘finger’, raised by Boe (2004) to illustrate Bengtson and 
Ruhlen’s phonetic laxity, is again enlightening (Table 2). 

According to Boe et al., the large number of different sounds reflecting each original
sound (20 for t, 21 for i including diphthongs and loss, or even 26 if long vowels are
counted separately, and 23 for k including loss) reveals Bengtson and Ruhlen’s laxity in
selecting their reflexes. And this laxity, of course, has severe probabilistic consequences.

But a glance at the phonetic nature of the sounds reflecting each sound in t-i-k 
shows that they form consistent sets, each defined by the region of the mouth where 
its member sounds are formed. Since a great majority of consonant evolutions preserve
the original place of articulation, these sets thus encompass sounds most likely
to evolve into one another. 

All consonants reflecting the initial coronal consonant t- of tik are also coronals. 
Coronals constitute a class of sounds pronounced with the tip of the tongue raised 
close to or against the upper front teeth (interdentals, dentals) or just behind them 
(alveolars, post-alveolars). These consonants articulated in the same region of the 
mouth as t are known to derive from earlier t’s in numerous languages. Not a single 
labial such as p, b, p’, β, f, or v, nor a dorsal like k, g, k’, y, x, or y, which are extremely
infrequent derivatives of a coronal consonant, appears in the series. Moreover, t itself 
occurs unchanged in 98 words out of 184, or 53.3%. 
Table 2. Number of occurrences of each reflex sound in the 184 presumed cognates supporting
Bengtson and Ruhlen’s (1994) Proto-Sapiens series tik ‘finger, one’

Reflexes of t-:
    t 98, d 23, ts 13, s 13, c 6, tʃ 4, z 4, tl 4, th 3, ts’ 3, J 3, t’ 2,
    ts 1, c 1, t 1, dl 1, s 1, [?] 1, tl’ 1, [?] 1

Reflexes of -i-:
    monophthongs (long vowels between parentheses):
    i 56 (5), e 33 (5), a 22 (1), o 18 (4), u 10 (1),
    I [?], [?] 2 (1), [?] 2, [?] 2, [?] 2, [?] 1
    diphthongs:
    ai 3, ia 2, ay 2, ei 1, yi 1, ai 1, ie 1, ea 1, oe 1
    loss:
    0 (zero) 5

Reflexes of -k:
    k 97, g 14, ʔ 10, c 6, h 5, q 4, kk 4, k’ 4, X 3, k^ 2, K [?],
    ʔk 2, nk 2, c 2, [?] 1, kp 1, gb 1, kh 1, q’ 1, h’k 1, xk 1, [?] 1
    loss:
    0 (zero) 14

For each of the three sounds t, i, and k, the assumed reflex sounds have been counted. The relatively 
numerous sounds reflecting each of the original consonants t and k constitute mutually exclusive 
sets (with the sole exception of c, as it is a likely derivative of both t and k, particularly in the vicinity 
of an i). Vowels are much less stable in all languages, and the assumed reflexes of i cover the whole 
spectrum of vowel qualities; nevertheless, high front vowels close to i ( 1 , i, e, e) and diphthongs with 
an i or an e make up an overwhelming majority of the total (117 out of 184, or 63.6%). 


If one then compares the set of sounds reflecting t- to that of sounds reflecting the
final velar -k, one observes that they are mutually exclusive. Nearly all sounds reflecting
-k are dorsal consonants like k itself. Dorsals constitute another broad class of
sounds pronounced with the back of the tongue against or close to the hard or soft palates
(palatals and velars, respectively) or the uvula (uvulars). All are known to reflect
earlier k’s in numerous languages. The only exception is the postalveolar coronal c
(with 6 occurrences reflecting -k, or 3.3%), which is a frequent outcome of a former k
in the vicinity of an i or an e (e.g. Latin civitatem [kiwitate] ‘city’ > Italian citta [citta],
or centum [kentu] ‘hundred’ > Italian cento [cento]). And k itself occurs unchanged in
97 words, or 52.7% of the total.

Obviously, the number of individual sounds reflecting each original consonant
ought to be related to the number of phonemes that do not reflect this sound. And
this relationship is easy to establish. No need to investigate the phonetic inventories
of all the 1,317 languages counted by Boe et al. In their 27 etymologies, Bengtson and
Ruhlen have used a clear principle: potential reflexes of a consonant essentially fall into
six categories. These are first defined by three points of articulation: the lips (labials),
the tip of the tongue (coronals), and its back (dorsals). These articulatory features,
which are among the most resistant in phonetic evolution, combine with the opposition
oral-nasal, also very resistant to change. Thus, every consonant in a word has on average
1 chance out of 6 of falling into any of the six categories: oral labial, oral coronal, oral
dorsal, nasal labial, nasal coronal, or nasal dorsal. For a two-consonant root like tik,
there is a (1/6) × (1/6) = 1/36 chance that its two consonants will each fall into a
particular category. In any given language, any two-consonant word root thus has
1 chance out of 36 - or 0.028, a tiny probability indeed - that each of its consonants
will fall within a particular category.
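
The six categories and the resulting 1/36 figure can be spelled out as follows. This sketch simply enumerates the categories named above and assumes, as the text does "on average", that the six categories are equally likely and that the two consonants of a root are independent; both are simplifications, and the names used are ours.

```python
from fractions import Fraction
from itertools import product

# Three places of articulation combined with the oral/nasal opposition.
places = ["labial", "coronal", "dorsal"]
manners = ["oral", "nasal"]
categories = [f"{manner} {place}" for manner, place in product(manners, places)]

# Chance that one consonant falls into one given category
# (equiprobable categories: a simplifying assumption).
p_one_consonant = Fraction(1, len(categories))

# Chance that both consonants of a root such as t-k each fall into
# one given category (independence: a simplifying assumption).
p_two_consonants = p_one_consonant ** 2

print(categories)
print(p_two_consonants, round(float(p_two_consonants), 3))
```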

This parameter can be calculated correctly after all. And it shows that Bengtson
and Ruhlen’s alleged phonetic laxity is in fact a strong constraint imposed on the discovery
of potential reflexes.⁷

2.2.3 Inaccuracy with regard to semantic correspondences

Boe et al. finally find that Bengtson and Ruhlen are lax with regard to meanings as 
well. And this assessment appears to be just as accurate as that regarding sounds: the 
apparent variety is great, but the actual diversity is small. Let us again examine how 
the various meanings of the words reflecting Proto-Sapiens tik ‘finger, one’ quoted by
Bengtson and Ruhlen are represented in their data (Table 3). 


Table 3. Number of occurrences of each of the 30 different meanings in the presumed
cognates supporting Bengtson and Ruhlen’s (1994) Proto-Sapiens series tik ‘finger, one’

‘one’ 67              ‘finger’ 37            ‘hand’ 23          ‘arm’ 10                ‘ten’ 9
‘to show, point’ 5    ‘toe’ 5                ‘only’ 5           ‘five’ 4                ‘alone’ 4
‘index finger’ 2      ‘middle finger’ 2      ‘only one’ 2       ‘fingernail’ 2          ‘thing’ 2
‘first’ 1             ‘to say’ 1             ‘one by one’ 1     ‘thumb’ 1               ‘once’ 1
‘foot’ 1              ‘with the fingers’ 1   ‘in hand’ 1        ‘to carry in hand’ 1    ‘by ones’ 1
‘paw’ 1               ‘single’ 1             ‘forefinger’ 1     ‘palm of hand’ 1        ‘guy’ 1


Total number of occurrences = 194 (> 184 because of a dozen words with two meanings). 


Here again, 30 different meanings are represented in Bengtson and Ruhlen’s series. 
But a glance at the number of occurrences of each meaning in their sample immediately
shows that the two main meanings, namely ‘finger’ and ‘one’, which are closely
linked together by the universal habit of counting on one’s fingers, account for 104 of 
the 194 total meanings, or 53.6%. 
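
The tally behind these figures can be reproduced directly from Table 3. The dictionary below merely transcribes the counts of that table; the variable names are ours.

```python
# Occurrences of each meaning, transcribed from Table 3.
meaning_counts = {
    "one": 67, "finger": 37, "hand": 23, "arm": 10, "ten": 9,
    "to show, point": 5, "toe": 5, "only": 5, "five": 4, "alone": 4,
    "index finger": 2, "middle finger": 2, "only one": 2,
    "fingernail": 2, "thing": 2,
    "first": 1, "to say": 1, "one by one": 1, "thumb": 1, "once": 1,
    "foot": 1, "with the fingers": 1, "in hand": 1,
    "to carry in hand": 1, "by ones": 1,
    "paw": 1, "single": 1, "forefinger": 1, "palm of hand": 1, "guy": 1,
}

total = sum(meaning_counts.values())
core = meaning_counts["one"] + meaning_counts["finger"]
core_share = core / total

print(len(meaning_counts), total, core, f"{core_share:.1%}")
```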

The other, less-represented meanings should not be counted as weakening the
numerous convergent words meaning ‘finger’ or ‘one’ - or Bengtson and Ruhlen could
simply have not included them in their series in the first place, just as they did not
include words meaning ‘elephant’ or ‘carmagnole’, even if they might have fit phonetically.
Though coherent with the two basic meanings from a historical viewpoint,
words meaning ‘hand’, ‘five’, ‘once’, etc. represent a bonus, often powerful when they
are known to descend from an original word with one of the two critical meanings in
their low-level family.


7. We did not take into account the fact that consonant devoicing is respected in 79.9% of
sounds reflecting t- and in 73.9% of those reflecting -k, nor the fact that 63.6% of vowels
are close phonetic images of -i-; these non-exclusive features are more difficult to integrate,
but may only have a further strong restrictive effect on the probability that the series might
have emerged randomly.

Moreover, the validity of an etymological meaning does not depend only or even 
primarily on the number of attestations of each modern meaning reflecting it, but 
much more on the reconstruction of a semantic evolutionary process. The original 
meaning of a word may have survived in few or even none of its descendants, while 
derived meanings may have proliferated. Obviously, Bengtson and Ruhlen’s tik series 
would have been much weaker if each of the 30 different meanings in their 184-word 
sample had been represented by 6 or 7 words, distributed without any evolutionary 
logic over the 21 families where reflexes of tik are found, contrary to what may readily 
be observed in the sample of Indo-European reflexes of tik in Appendix G. 

The probability of finding a root with an initial t- (or any other oral coronal) fol¬ 
lowed by a -k (or any other oral dorsal) with either of the two meanings ‘finger’ or ‘one’ 
is double that of finding a phonetically fitting word with only one particular 
meaning. As a result, 1 language out of 18 (instead of 36) should display consonants from 
two particular sets in a word with one of the two meanings ‘finger’ or ‘one’ by the effect 
of chance. This probability of 1/18, or 0.056, is still low, and it should apply, following 
Boë’s method, to all 104 languages where words with one of these two meanings have 
been found. (But we have seen in Section 2.2.1 above that these 104 languages are far 
from being the only ones to take into account, and, moreover, that their number is not 
really relevant.) 

The 90 words with other meanings should be given a higher probability, though 
certainly not of 1, depending on the number of evolutionary steps separating them 
from the original meaning and on the number of words likely to be reached at each 
step. But calculating their respective probabilities, for each word in each language, 
would require very long investigations, which are not necessary with Bengtson and 
Ruhlen’s etymologies. In the 21 families where they found it - out of their 32 low- or 
medium-level language families covering all existing languages - tik must have had 
‘finger’ or ‘one’ as its etymological meaning in at least Niger-Congo, Nilo-Saharan, 
Afroasiatic, Uralic, Korean, Eskimo-Aleut, Yeniseian, Sino-Tibetan, Na-Dene, Miao- 
Yao, Daic, and Amerind, to which one can likely add Indo-European and Turkic. To 
retain only the most secure ones, there are 12 ancestral languages displaying a root 
meaning ‘finger’ or ‘one’ with an initial coronal and a final dorsal consonant, a pho¬ 
netic configuration which should occur by chance in 1 language (or ancestor language) 
out of 18 - not in 12 out of 32. The actual presence of tik-type roots with secure mean¬ 
ings ‘finger’ or ‘one’ in 37.5% of the world’s language families is thus at least 6.8 times 
above the 5.6% chance level. And this gap between chance and facts could only be 
enhanced, though more modestly, by the 9 other families with less strong semantic 
correspondences. 
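
The arithmetic of this paragraph can be checked with a short sketch. The ratio reproduces the figures given above; the binomial tail is an illustrative addition that is not part of the original calculation, and it rests on the assumption (made here only for the sake of the example) that the 32 families behave as independent trials sharing the same 1/18 chance probability. All names are hypothetical.

```python
from fractions import Fraction
from math import comb

# Figures taken from the text.
P_CHANCE = Fraction(1, 18)   # chance of a coronal...dorsal root meaning 'finger' or 'one'
FAMILIES_TOTAL = 32          # low- or medium-level families
FAMILIES_SECURE = 12         # families with a secure 'finger' / 'one' etymology

# Observed share of families (37.5%) and its ratio to the chance level.
observed_share = Fraction(FAMILIES_SECURE, FAMILIES_TOTAL)
ratio_to_chance = observed_share / P_CHANCE   # exactly 27/4, i.e. 6.75


def binomial_tail(n, k, p):
    """Probability of at least k successes in n independent trials.

    ASSUMPTION: independence and a common probability p for all trials;
    the source does not make this modelling choice.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


tail = binomial_tail(FAMILIES_TOTAL, FAMILIES_SECURE, P_CHANCE)

print(f"observed share: {float(observed_share):.1%}")
print(f"ratio to chance level: {float(ratio_to_chance):.2f}")
print(f"binomial tail P(X >= 12): {float(tail):.2e}")
```

The exact ratio is 6.75, which the text rounds to 6.8. Under the independence assumption, the tail probability falls well below one in a million.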
In short, counting the number of different meanings reflecting an original mean¬ 
ing in order to assess the plausibility of an etymological series is, strictly speaking, 
meaningless. For each word reflecting the proposed root in a given language, the plau¬ 
sibility of its semantic derivation (if any) must be assessed in the light of related words 
in its family as well as in closely related families. In Appendix G1, we can see that the 
PIE root *deike- ‘to point’ has descendants endowed with verb meanings as different as 
‘to point out, to show, to exhibit, to confess, to say, to teach, to accuse, to manifest, to 
give a sign’ and others, plus nouns as disparate as ‘direction, region, part, earth, world, 
camping ground, country, village, cultivated field, side, span, hand span, amazement, 
finger, toe, accusation, sign, example, token, dedication, discourse’, and ‘judge’, totaling 
31 different meanings (and more could be added). Is PIE *deike- disqualified by 
this variety? Certainly not, because the variety is only superficial, and in each Indo- 
European subgroup meanings are organized into apparent logical evolutionary chains. 
This evolutionary logic cannot be adequately accounted for by a statistical model. 

2.2.4 Summary 

The negative conclusions of the probabilistic calculations we have examined (Boë 
2004; Boë et al. 2003, 2006; Ringe 2002) cannot be regarded as valid. 

Although it seems relatively easy to take into account the degree of phonetic 
validity of assumed reflex words, it is very difficult to reduce to figures the dif¬ 
ferences in taxonomic level between languages (the greater etymological weight 
of, e.g., Proto-Indo-European against any of its descendants, or of Basque against 
Gascon), or in logically derived meanings in a linguistic lineage versus meanings 
picked up at random without regard to semantic evolutionary logic (e.g. the logi¬ 
cal validity of deriving ‘toe’ from ‘finger’, against the invalidity of directly deriving 
‘toe’ from ‘to point’). More work will be necessary to perhaps achieve a satisfactory 
assessment of etymological series by mathematical means. 

A point that is relatively difficult to conceive and understand is how multilateral 
etymological series differ from phonetically regular etymological series in lower-level 
language families. The latter (like those shown in Appendices G1 and G2, PIE 
*deik’e- ‘to point, to show’ and *dekm- ‘ten’, respectively) aim to trace with certainty 
the descent of a root in all the descendant languages. The most powerful tool to 
ascertain that words from different languages belong to such etymologies is regular 
phonetic correspondences, which may practically eliminate any doubt that a partic¬ 
ular word displaying them belongs to a given series, without any probabilistic assess¬ 
ment being needed - not because there is any magic in regular correspondences, 
but because they link together dozens of word series by their constituent sounds in 
metaseries whose appearance by chance would obviously have been highly improb¬ 
able, just as no calculation is needed to realize that, say, getting 200 aces of hearts 
when taking a card at random from each of 200 decks is a near impossibility. 
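
For readers who nevertheless want a figure, the card example can be quantified with a minimal sketch. It assumes standard 52-card decks and independent draws, which is what the example implies; the variable names are hypothetical.

```python
import math

DECKS = 200            # number of decks, as in the example above
CARDS_PER_DECK = 52    # ASSUMPTION: standard decks without jokers

# Probability of drawing the ace of hearts from every deck is (1/52)**200.
# Working in base-10 logarithms keeps the result readable and avoids
# floating-point underflow for such a tiny number.
log10_probability = -DECKS * math.log10(CARDS_PER_DECK)

print(f"probability is about 10^{log10_probability:.1f}")
```

The exponent comes out at roughly -343, a probability so small that it cannot even be represented as an ordinary double-precision number.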
Multilateral series, in turn, rely on phonetic correspondences that are often not 
demonstrably regular in the state of our knowledge; in other words, they are not found 
again and again across different word series. But the phonetic nature of these corre¬ 
spondences otherwise complies, within each series, with evolutionary rules that have 
been discovered, over the last two centuries, in low-level families thanks to regular 
correspondences. As we have seen in Section 2.2.2, these rules impose a strong con¬ 
straint on the discovery of potential cognates. This constraint is, however, weaker than 
that posed by regular correspondences themselves and does not warrant that each 
particular word included in a series really belongs to it; nevertheless, if many words 
in a series repeatedly satisfy this constraint, the likelihood that the entire series has 
appeared at random quickly drops. Consequently, a multilateral series warrants the 
authenticity of a root in a proto-language, while none of its assumed descendants may 
be considered to descend from it with perfect certainty - even if, taken collectively, 
most of them must descend from it. 

This apparent paradox was expressed by Bengtson and Ruhlen: 

We do not harbor illusions ... that every etymological connection we propose 
will be found, ultimately, to be correct, but we do believe that the removal of such 
errors as may exist in these global etymologies will not seriously affect the basic 
hypothesis, which does not depend on any specific link for its validity. 

(Bengtson & Ruhlen 1994, p. 292) 

What replaces regular phonetic correspondences in multilateral series is the number 
of families involved in them, and the recurrence of series within a particular group of 
families, such as that of *m- ‘first person’ and *t- ‘second person’, which are (together 
with many others) particular to the group of families Greenberg (2000-2001) calls 
Eurasiatic. 

Many etymologies presented by Greenberg and Bengtson and Ruhlen, including 
the ones discussed above, are so massively supported that no probabilistic calcu¬ 
lation is needed. Just like papa/mama words or, for that matter, reconstructions 
supported by regular sound correspondences, they are far beyond the point where 
sophisticated tools might be necessary. 

However, accurate probabilities might be useful to uncover other, less well- 
preserved roots, to assess disputed taxa, and more generally to enlarge our com¬ 
parative knowledge of remote language families. One can only encourage both 
statisticians and comparatists to continue to address this difficult problem in a 
constructive spirit. 

If papa/mama words have managed to last for several dozen millennia, why could 
not some other words have resisted as well? And perhaps not so few of them - after 
all, Bengtson and Ruhlen’s 27 Proto-Sapiens etymologies result from the efforts of 
two scholars, while hundreds of Indo-Europeanists have worked over the last two 
centuries on a few dozen closely related languages to unearth some 2,500 PIE roots. 

3. Why kinship appellatives do not change: Children babbling, parents choosing 

Let us now examine two lines of evidence from different fields of the study of language, 
which converge with our own to support the hypothesis that papa/mama words must 
have played a crucial role in the early appearance of articulate speech. 

Papa/mama words have been preserved over the whole history of language fami¬ 
lies with a written tradition, as documented in Section 1 for a number of such families, 
and comparison within language families with no written record shows that such is 
the case for them as well (Matthey de l’Etang & Bancel 2012). Why is it, then, that 
they are not - or, at least, very infrequently - subject to phonetic change and word 
replacement, as all other words are? The reason is simple and compelling, and every 
parent who has raised a child who developed normal speech knows it, but this com¬ 
mon experience has percolated into the domain of scientific knowledge only recently 
and without attracting much attention. Papa/mama words are crucial for babies to 
learn and for parents to teach babies to speak. The actual mode of their transmission 
has been explained by the language acquisition specialist John Locke (1990), and it is 
a nice piece of collaboration between parents and children. 

Around the age of 6 to 9 months, on average, all babies enter the babbling stage 
of language acquisition. Canonical babbling consists of repetitive bababa, papapa, 
mamama, dadada, tatata, and nanana syllables, made up of plain labial or coronal 
nasal or oral stops, plus an open vowel (Oller 1980). It has long been recognized that 
these syllables are the first to be mastered by children because they are the easiest, 
due to a range of constraints (Westermarck 1891; Jakobson 1960; MacNeilage & Davis 
1990; MacNeilage 2008). 

Among these sequences, parents “recognize” those corresponding to a word in 
their language and reinforce them - notably by repeating them in their standard form 
while pointing a finger at the parent concerned - while they leave unreinforced other 
sequences that do not match with a word in their language, and which the child will 
thus progressively abandon. 

This was Locke’s (1990) great discovery, which definitively falsifies the theory of 
the spontaneous emergence of these words. Or, rather, it falsifies the theory that babies 
invent them alone. Children spontaneously provide a range of syllabic frameworks, 
and parents rectify some of them into the canonical forms of the corresponding words 
in their language: English parents reinforce dadada and mamama into dad and mom 
(or mum), respectively; French parents reinforce papapa and mamama into papa 
and maman, respectively; Turkish parents reinforce bababa and nanana into baba 
and anne (a word related to Proto-Turkic ana ‘mother’, also inherited in Turkish; see 
Appendix E3); and so on. It would never occur to a monolingual English mother to 
induce her daughter to call her anne (even if her own given name is Ann or Annie), 
nor to monolingual French parents to recognize in their baby’s babbling of dadada a 
word meaning ‘dad’ and to reinforce it. 

This crucial way of transmitting papa/mama words explains why English children 
consistently learn dad and mom, French children papa and maman, Turkish children 
baba and anne, Hindi-speaking children bāp and mā, and so on. Each of these words 
belongs to the lexicon of a particular language. Children provide the initial spontane¬ 
ous syllabic framework; the exact phonetic form and meaning of each word are taught 
by parents. This fact, which clearly implies lexical inheritance rather than innovation, 
was elusively recognized by Jakobson (1960) in the paper in which he paradoxically 
argued for spontaneous innovations instead of common descent: 

[C]hildren, being prompted and instigated by the extant nursery words, gradually 
turn the nasal interjection into a parental term and adapt its expressive make-up 
to their regular phonemic pattern. 

This “prompting and instigation by extant nursery words” discreetly acknowledges 
the fact that parents reinforce their child’s babbling and shape it into already existing 
words. And behind this teaching stands an uninterrupted transmission from genera¬ 
tion to generation. 

This specific mode of transmission also explains why these words change so 
rarely. When a language is in the process of undergoing a phonetic change that 
should change their form - for instance when stops between vowels change to fricatives 
(a very common type of change), so that baba, papa, dada, and tata should 
become bava, pafa, daza, and tasa, respectively - the bio- and neuromechanical 
constraints bearing on babies who are learning to speak at that particular time are 
usually stronger. Babies do not master fricatives and continue to say baba, 
papa, etc., preventing the change from applying to the word in question; parents 
recognize the form baba or papa they have heard since their own childhood and 
reinforce it rather than the modified form, which in any case exceeds the baby’s 
articulatory capacities. 

As a result, these words are transmitted from one generation to another without 
change, and are unlikely to be lost, since the same spontaneous syllabic frameworks 
reappear every time another child reaches the age of 6 to 9 months and begins bab¬ 
bling - a phenomenon which must have occurred regularly in all human groups that 
have survived long enough for us to know something of their language, and thus have 
covered nearly all periods of phonetic change in all languages. 

Finally, papa/mama words are crucial in another aspect of language transmission. 
In children’s first utterances, they function no differently from animal communica¬ 
tion. They have been dubbed holophrastic words (“a whole phrase in one word”; see 
De Laguna 1927), because they seem to convey information that should be rendered 
in adult language by a complex sentence. Brigaudiot and Danon-Borleau (2002), in a 
section entitled “Les premiers maman, holophrases ou énoncés à un terme” [The first 
maman, holophrases or single-term utterances], quote a century-old analysis: 

Childish mama, translated into advanced speech, does not mean ‘mother’ but 
rather a sentence such as ‘Mama, come here’, ‘Mama, give me...’, or ‘Mama, put 
me in the chair’, or ‘Mama, help me’. (Stern & Stern 1907) 

These holophrases are similar to the calls of young animals “holophrastically” call¬ 
ing their mothers, except that the human baby’s call, contrary to those of all other 
animals, is phonetically articulate: it consists of vowels and consonants arranged into 
syllables. But papa/mama words do not remain mere calls for long. Within a few weeks 
or months, reinforcement by elders, together with the recurrence of the association, 
in the parents’ speech, of one particular reinforced sound sequence with the presence 
of the mother, and of another one with the father, induces the child to establish a link 
between each of these sequences and a particular being in the outside world. And this 
association is crucial, since it opens the door of symbolic meaning for the child. 

In this way, too, parental appellatives play a unique role in the transmission of 
language. And it must have been so for untold ages. 


4. Back to Proto-Human: The Frame, then Content hypothesis 

Papa/mama words have survived - or, rather, their continuous transmission and pres¬ 
ervation was necessary to our ancestors - over the last 2,000 to 10,000 generations. 
During this period, they have been crucial for babies learning to speak - and their par¬ 
ents teaching them - in the nice collaborative effort described by Locke. Why should 
they not have been preserved over the 20,000 to 100,000 generations before that? We 
suggested long ago that kinship appellatives might have been among the very first phonetically 
articulate sounds (Bancel & Matthey de l’Etang 2002), no doubt a long time 
before Proto-Sapiens was spoken. 

At that time, we were not aware of Peter MacNeilage’s (2008; see also MacNeilage, 
this volume) “Frame, then Content” phonetic theory of the origin of speech, first pre¬ 
sented with respect to modern babies by MacNeilage and Davis (1990). This theory 
holds that papa/mama words must be the first sound sequences mastered by a human 
mouth, for compelling phonetic reasons discovered through the observation of lan¬ 
guage acquisition. To understand these reasons, one has to recall that all humans 
speaking a language, whatever their individual differences, are true virtuosos - just 
like all falcons are nonesuch sky-divers, or all whales are outstanding apnea sea-divers, 
as a result of major selective pressures. 

As explained by Lieberman (1992) - whose pioneering work (e.g. Lieberman 
1975, 1985, 2000) opened the door to the study of language evolution, which had 
remained barred for a century - speaking is the most difficult motor activity, because 
of the extreme speed and precision of the successive motions involved in the articu¬ 
lation of a speech sound string. According to MacNeilage (2008), about 40 different 
muscles are involved in the production of the various speech sounds, each perform¬ 
ing a very different function such as controlling the pressure of the airflow breathed 
out of the lungs, or the tension of vocal cords, opening and closing the nasal airway, 
or giving the vocal tract a particular shape. Based on an average of 15 of these muscles 
being involved in each particular sound, and a speech rate of 15 sounds per second, 
MacNeilage arrives at the fantastic number of 225 muscular actions per second in 
speech, or one every 5 milliseconds. Most of them must be effected with millimetric 
precision, and all must be tightly coordinated; otherwise the sounds produced are not 
those intended. Such defects in coordination do indeed happen and are a major source 
of phonetic evolution, showing that when speaking we are always at the extreme limit 
of our capacities, without even being aware of it. 
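
The order of magnitude quoted above follows directly from the two figures attributed to MacNeilage, as this small sketch shows (the constant names are hypothetical). The exact interval is about 4.4 milliseconds, which the text rounds to 5.

```python
# Figures quoted in the text from MacNeilage (2008).
MUSCLES_PER_SOUND = 15   # average number of muscles involved in one speech sound
SOUNDS_PER_SECOND = 15   # speech rate

# Total muscular actions per second, and the mean interval between them.
actions_per_second = MUSCLES_PER_SOUND * SOUNDS_PER_SECOND
ms_per_action = 1000 / actions_per_second

print(f"{actions_per_second} muscular actions per second")
print(f"one action every {ms_per_action:.1f} ms on average")
```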

On the auditory side, the high speed of some 15 to 25 units per second at which 
speech sounds are normally delivered is equally amazing. Hearers decode them easily, 
although this rate often exceeds the limit of 15 units per second beyond which other 
sounds merge into an undifferentiated buzz in the hearer’s perception. And the brain 
areas and connections able to process this high-speed auditory flow can do so effi¬ 
ciently only after appropriate training - that is, learning the language. Just think how 
difficult it is, when you start learning a new language, merely to perceive the sounds 
you are not used to. 

The extreme difficulty of both speaking and hearing an articulate language may 
be the reason why babies spontaneously start babbling in the second half of their first 
year. This universal behavior must rely at least partly on an innate trend, resulting from 
a heavy selective pressure exerted on humans to begin speaking at an early age, so they 
can gain the required fluency, again confirming that articulate speech has long been a 
major feature of the human ecological niche. 

It also explains why babbling consists of plain stops and vowels in the simplest 
syllable sequences. Babbling typically reduplicates the most basic articulate syllables, 
namely consonant-vowel (CV), in CVCVCV... sequences. These syllable sequences 
using only two sounds are the easiest way to produce an articulate speech flow, as they 
require the synchronization of very few muscles into a repetitive, dual motor scheme, 
however long the syllable sequence may be. Moreover, MacNeilage and Davis (1990) 
found in early babbling an inertial pattern whereby the tongue stays in the same position 
for the vowel as it was in for the previous consonant, or the surrounding consonants 
in reduplicative babbling. Consequently, to produce bababa, an infant initially 
needs only a couple of mandible elevation muscles and a couple of mandible depression 
muscles. For dædædæ, she only needs to add the inferior genioglossus to move the 
tongue forward and up, and for gogogo only two or three muscles are added to the ones 
used for mandibular oscillation (Peter MacNeilage, personal communication).⁸ From 
the neuromotor viewpoint, this is a huge simplification with regard to the require¬ 
ments of adult speech. 

The complete closure of stops also allows much more variation in the articulatory 
motions than for any other speech sound. No matter what the speed, strength, and 
precision of the closing motion to produce a stop may be - whichever way the airflow 
is closed and reopened, it will produce an acceptable approximation of the intended 
sound. In contrast, other consonants such as fricatives or glides demand millimetric 
precision in their execution, and any deviation from the intended target is likely to 
drastically modify the acoustic output. 

Furthermore, as already noted by MacNeilage and Davis (1990), a babbling 
sequence essentially relies on motions that lower and raise the jaw - a motion over 
which voluntary control has been selected in the human lineage since our distant 
Gnathostomata ancestors, which appeared some 450 million years ago, acquired a mouth 
with a jaw. 

The articulatory, motor, and syllabic robustness of consonants p, b, m, t, d, n is 
the reason why these speech sounds are the first ones children regularly master in the 
articulated syllable sequences papapa, bababa, mamama, etc. Of course, if one ran¬ 
domly “tries” one’s articulatory organs in order to make a sound, any human phoneme 
(and many other sounds) may result. However, when it comes to reproducing a sound, 
and - which is still more difficult - a sequence of two sounds at will, of course the easi¬ 
est sounds and sequences must be the first to be mastered. 

This is exactly what children do, and what humans learning to speak - either early 
in life or at an older age - must always have done. Both MacNeilage (2008) and we 
(Bancel & Matthey de l’Etang 2002, 2005) independently arrived at the conclusion that 
this rule must have been in force since the very beginning of articulate speech - not 
only in Homo sapiens, but in our more ancient human ancestors as well. 


8. Our comparative data converge with MacNeilage and Davis’ (1990) finding concerning 
the detail of vowels in early babbling. According to them, children’s first velar consonants 
occur with a velarized vowel in sequences like gogo or koko. While compiling our kaka etymological 
series, we were soon struck by finding a high number of koko ~ kuku (or gogo ~ gugu) 
forms, sometimes even predominant over kaka forms, as in Nilotic or Southern Amerind, but 
also occurring sporadically in many other language groups. In contrast, popo ~ bobo and toto 
~ dodo variants of papa ~ baba and tata ~ dada are extremely rare. We could not find any 
consistent semantic correlate of this variation in the vowel. MacNeilage and Davis’ finding 
regarding modern children may be regarded as confirmed by this globally frequent variant. 
Reciprocally, while it does not help to resolve the question of the antiquity of these koko ~ 
kuku variants, it provides an explanation for their existence. 

5. By way of conclusion: The early steps towards articulate language 

As we have seen, there are three lines of independent findings. The first finding is 
that of MacNeilage and ourselves, based on the phonetics of language acquisition, that 
papa/mama sound sequences are the obligatory first steps towards mastering articu¬ 
late speech, and must have been so throughout human history. The second one is that 
of Locke, showing how children and parents cooperate in the transmission of papa/ 
mama words; even if Locke himself does not consider the issue from an evolutionary 
perspective, there is no doubt that this mode of transmission is not recent in humans. 
And the third is Ruhlen’s and our own finding, supported by data from thousands of 
languages worldwide, that most papa/mama words can only have been inherited from 
a common Proto-Sapiens language. All three lines of evidence converge on a scenario 
in which kinship appellatives must have played a prominent role early in the evolution 
of speech in humans and might even have been at its very origin. 

Beyond this striking convergence, this scenario has other aspects adding to its 
evolutionary value. In particular, the initial acquisition by babies of phonetic articula¬ 
tion in their babbling stage through meaningless syllable sequences, some of which are 
then given a meaning by parents, seems to be a step towards the solution of a mystery 
that has barely been noted, much less explained, since research about language origins 
has burgeoned. 

Words have to have been invented, however long this invention may have taken.⁹ 
But how? Both phonetic articulation and referential meaning are unprecedented in 
animal history, and both are too complicated to have been developed simultaneously. 
The first step towards the elucidation of their origin must therefore be to discover 
which appeared first. Babbling babies show us that phonetic articulation appears first 
in all contemporary individuals. And it must have been so originally as well, since 
speech is such a difficult activity that, if humans had found another way to convey 
referential meanings in the beginning, they certainly would not have gone to the trou¬ 
ble of trying to move their tongues and lips at an incredible speed from one incred¬ 
ibly weird position to another but would have stuck to the previously used means of 
expression and developed it further. Articulate speech must have been discovered by 
chance, as was the case with all biological evolutions before and after it, and in its sim¬ 
plest form - that of mamama, papapa, bababa sound sequences. It must also have been 
initially used to fulfill previously existing communicative, non-referential functions. 
Only later, probably much later, did its wonderful but highly demanding properties 
allow for a very slow differentiation of sequences based on very few consonants. It 
opened the door to a functional differentiation, which, ultimately, led to the emergence 
of semantic reference. 

9. The reluctance to deal with the emergence of words is most conspicuous in Kenneally’s 
(2007) book The First Word. In spite of its title, this summary of the current state of research 
about language evolution does not even allude to the question. 

The consonants in kinship appellatives already delineate a simple phonetic feature 
system, based on articulatory motions and the corresponding bundles of neuromotor 
commands, each of which must be called into play with different command bundles 
to produce different consonant sounds. Appellatives also constitute a simple seman¬ 
tic system, based on a few obvious semantic features, the first being the opposition 
between males and females. They thus offer a plausible path to the development of 
structured phonetic and semantic systems, whose interrelated features have made us 
the “symbolic species” (Deacon 1997, p. 87, Figure 3.3). 

Finally, let us allude to the fact that kinship is another uniquely human trait, 
whose insertion in the humanization evolutionary process has hardly been discussed 
before, in spite of the many promising avenues it offers. Articulate language, this 
essentially social human ability, might not have developed without a reinforcement of 
social bonds, and kinship has long been the primary mode of human social organiza¬ 
tion. The antiquity of kinship is warranted by both the universality of kinship systems 
in all known human groups and the existence of precursors of kinship relationships in 
apes. Given the complexity of both language and kinship, it is only natural that they 
have coevolved, further enhancing the plausibility that the first symbolic meanings 
ever acquired by humans concerned kinship relations. 

5.1 How else may Proto-Sapiens aid the study of language origins? 

Finally, let us illustrate briefly how remote etymologies could shed light on other 
aspects of the evolution of language ability. Apart from papa/mama words, the most 
resistant words worldwide are first- and second-person pronouns (Dolgopolsky 1964; 
Pagel 2000). In all families, they display an incredible resistance, as compared to the 
average replacement rates of 13% to 18% per millennium that have been calculated 
for the 100 or 200 most basic words (body parts, natural elements, kinship relations, 
pronouns, basic verbs, etc.). 

In an unpublished study bearing on 494 Indo-European languages and dialects, 
we have found that the PIE first-person *m- and second-person *t- have been lost, after 
6,000 to 8,000 years, in an amazingly small number of descendant languages. First- 
person *m- was lost by only two languages (0.4%), which amounts to a loss rate of 
0.05% per millennium, granting *m- a half-life of 1.38 million years. In turn, *t- has 


10. The calculation of the half-life of words was devised by Pagel (2000). It is not as reliable 
as its prototype in physics, where one observes the decay of a given quantity of an element 
disappeared from seven languages (1.4%), which endows it with a loss rate per millennium 
of 0.18%, and a half-life of 385,000 years. 
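
The loss rates and half-lives quoted for *m- and *t- can be reproduced with the sketch below. Two assumptions, not stated in the text, are needed: the time depth is taken as 8,000 years (the upper bound of the range given), and the half-life is derived from the rounded per-millennium rate under a simple constant-rate exponential decay, half-life = ln 2 / rate. These choices are made here only because they yield the published figures; the function names are hypothetical.

```python
import math

LANGUAGES = 494     # Indo-European languages and dialects in the sample
MILLENNIA = 8       # ASSUMPTION: upper bound of the 6,000-8,000 year range


def loss_rate_percent(lost, total=LANGUAGES, millennia=MILLENNIA):
    """Share of languages that lost the root, in percent per millennium."""
    return 100 * lost / total / millennia


def half_life_years(rate_percent_per_millennium):
    """Half-life under constant-rate exponential decay (ln 2 / rate)."""
    rate = rate_percent_per_millennium / 100
    return 1000 * math.log(2) / rate


# Rates are rounded to two decimals before the half-life is computed.
rate_m = round(loss_rate_percent(2), 2)   # first-person *m-, lost in 2 languages
rate_t = round(loss_rate_percent(7), 2)   # second-person *t-, lost in 7 languages

half_life_m = half_life_years(rate_m)
half_life_t = half_life_years(rate_t)

print(f"*m-: {rate_m}% per millennium, half-life {half_life_m:,.0f} years")
print(f"*t-: {rate_t}% per millennium, half-life {half_life_t:,.0f} years")
```

With these assumptions the sketch gives 0.05% and 0.18% per millennium, and half-lives close to 1.38 million and 385,000 years respectively. Using the unrounded rates or a 6,000-year time depth shifts the half-lives somewhat without changing their order of magnitude.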

Personal pronouns from most language families display similarly minuscule loss 
rates. However, unlike papa/mama words, and contrary to what might be expected 
given this extraordinary longevity, there is no global convergence of phonetic forms 
and meanings in first- and second-person pronouns. We have studied (Bancel & Matthey 
de l’Etang 2008, 2010) the phonetic distribution of pronoun roots in shallow-time 
ancestral pronominal forms worldwide compiled by Ruhlen (1994b, pp. 252-260) - 
who, interestingly, did not discover any Proto-Sapiens first- or second-person pronoun 
root in spite of the pleasure he no doubt would have had in finding one. We have found 
that a majority of these pronoun roots are based on a handful of consonants, which, 
however, are distributed among the first and second persons in apparent disorder at 
the global level. A root m- may represent the first person singular in some phyla (like 
Eurasiatic or Niger-Congo), or the second person singular in others (like Amerind), 
and the same holds true of the other globally widespread pronominal root consonants 
t-, n-, k-, and s-, in spite of their monolithic coherence at the family-internal level. 

Another salient aspect of the phonetic distribution of pronominal root consonants 
is the near absence of plain oral labial stops (p-, b-), with very few exceptions, 
and those few are often demonstrably secondary, such as bi- ~ be- ‘I (nominative)’ in 
Altaic languages.¹¹ While this global absence remains unexplained, its very existence 
must be considered as indicative of a relationship between all pronominal forms. Given 
that plain oral labial stops are among the most frequent consonants in the world’s lan¬ 
guages (Maddieson 1984, 1997) and are rather resistant to phonetic change, if first- 
and second-person pronouns had independent origins in many language families, a 
good number of them ought to be based on a root p- or b-. 

over time. With words, one may only observe their loss as the ancestral language splits into
multiple descendants. It does, however, give a good indication of their relative stability.

11. The Altaic language family consists of Turkic (Turkish, Uzbek, Kazakh, etc.), Mongolic
(Classical Mongolian, Khalkha, Buriat, etc.), and Tungusic (Manchu, Evenk, Nanai, etc.);
Korean and Japonic, thought by Greenberg to be related to the former groups at a greater
remove within Eurasiatic, are often included within Altaic by Nostraticists.

We then thought that first- and second-person pronouns (and first- and second-
person markers more generally) may have emerged only with the fluent use of syntactic
articulation, and the necessity to quickly differentiate the speaker and the hearer in
a complex sentence. In the stages that preceded the evolution of syntactic articulation
in a broad sense - stringing words together - words were mostly used in isolation,
and a great proportion of the speaker’s intentions had to be inferred from the context.
Words were, however, highly useful, as compared with no words at all, thanks to their
property of referring to objects or actions known to the speaker and the hearer. They
gave the hearer an anchor to infer the rest, in a world where human activities and 
interests were much more restricted and predictable than in any society known to us 
today. But first- and second-person pronouns have the strange and unique property of 
shifting reference with the speech turn. One does not see how such words, deprived 
of the essential quality of words at that time of referring to a stable object, could ever 
have appeared, nor what use they might have had - a single-word sentence “me” or 
“thou” would have given little information to the hearer. When syntactic articulation 
first began, verbs may only have been “action or state words,” with no mark of tense, 
voice, person, or number, just as they had been before, when used in isolation. As 
stringing words together became a widespread habit, then a norm, it became necessary
to disambiguate the subject and object of verbs - very often the speaker or hearer
themselves, the most interesting themes for two interacting individuals - with nouns 
used to address the hearer and self-refer to the speaker. Thanks to this repetitive use, 
the most frequent of these nouns must, by a process which remains unclear (although 
probably not forever), have evolved in shortened forms into first- and second-person 
pronouns. 

Our conclusion was that, at the time of Proto-Sapiens, personal pronouns were 
already being formed, since they are found in all language families (in spite of their 
not being absolutely necessary, albeit very useful) but were not yet fixed as a lexical 
category. The original nouns that had given rise to them still varied freely between 
referring to the speaker and referring to the hearer, according to their original nominal 
meaning, and only later were fixed onto either first or second person in each family. 
Since the very existence of first- and second-person pronouns is hardly conceivable 
without a syntactically articulated language, Proto-Sapiens at the time of its split must 
have been in the process of acquiring syntactic articulation. This process certainly took 
time, and perhaps lasted until late into the Upper Paleolithic, judging by the fact that there
are more reconstructed first- and second-person pronoun roots in ancient taxa, such
as Eurasiatic (Greenberg 2000) or Nostratic (Bomhard 2008; Dolgopolsky 2008), than
in recent ones. The unexpected absence of a clear-cut distinction between first- and
second-person pronominal roots at the global level would thus testify that syntactic
articulation had begun to evolve before the dispersion of modern humans, and probably
was part of its success, but had not yet led to the development of full-fledged first-
and second-person pronouns. 

It has been repeated recently that the origin of language is the most difficult scientific
problem of our time. At the very least, it is certainly the most difficult problem
resisting evolutionary theory. How could one hope to solve it without the powerful 
tool of comparative linguistics, which opens a window on past times as far back as 
the initial dispersion of our Homo sapiens ancestors? How could one hope to solve it 
without giving spoken words their legitimate due?

Appendices: Comparative data

Appendix A. The Proto-Indo-European root *ma- ~ *mama- ‘mother’ 

[or, rather, ‘mother, mom’] 

The reference of data not drawn from Nikolayev (2007) is given when it is relatively
difficult to access (i.e. essentially for the Indic, Nuristani, and Iranian groups); additional
data from European language groups have been drawn from standard dictionaries,
often accessible on the Internet.

Indic : Proto-Indic *md ‘mother’: Pali mdmikd ‘mother’; Prakrit mdu ‘mother’; Germany
Gypsy mama ‘mother’; Romania Gypsy mdmi ‘grandmother’; Bashkarik mem
‘mother’s mother’, mdm ‘mother’s father’; Phalula memi ‘mother’s mother’, momo
‘mother’s father’; Domaki mdma ‘mother’; Tirahi md ‘mother’; Shina (Gilgiti
dial.) md ‘mother’; Shina (Kohistani, Palesi) md ‘mother’; Shina (Guresi) mdh
‘mother’; Sindhi mdu ‘mother’; Lahnda md ‘mother’; Lahnda (Awankari dial.) md
‘mother’; Punjabi ma ~ mau ~ mdt ~ mdmmi ‘mother’; West Pahari (Curahi dial.)
md ‘mother’; Kotgarhi md ‘mother’, mdi ‘mother, goddess Durga’; Kumauni md
‘mother, mother-in-law’; Nepali mdu ‘female animal having given birth’; Assamese
md ~ mdu ‘mother’, mdi ‘mother, mother’s brother’s wife’; Bengali md ‘mother’,
mdi ‘breast’; Oriya mdd ~ md ‘mother’, mdi ‘woman’; Maithili mdi ‘mother’; Bhojpuri
mdi ‘mother’; Awadi (Lakhimpuri dial.) mdi ‘mother’; Hindi md ~ mdi ~ md
‘mother’; Old Marwari md ‘mother’; Gujarati md ~ mdi ‘mother’; Marathi md ~
/noi ‘mother’, mdi ‘mother-in-law’ (Turner 1962-1966, etym. 10016 & 10058).

Iranian : Ossetic mama ‘mom’ (Abaev 1970); Yaghnobi momo ‘grandmother’ (Bird 2006);
Wakhi mum ‘grandmother’ (Grierson 1920); Persian mdm ‘mom’, mdmd ‘midwife’
(Hayyim 1934-1936); Zaza ma ‘mother’ (Werner 2009).

Armenian : mam ‘grandmother’.

Hellenic : Classical Greek maga ‘Earth Mother!’; (Homeric) mdia ‘address to an old woman’;
(Attic) mdia ‘mom, wet nurse, midwife’; mdmme ‘mom, granny’; (Doric) mdia
‘granny’; Standard Modern Greek mama ‘mom’, mammi ‘granny’.

Slavic : Proto-Slavic *mama ‘mom’: Belorussian mama ‘mom’; Russian mama ‘mom’;
Ukrainian mama ‘mom’; Bulgarian mama ‘mom’; Serbo-Croatian mama ‘mom’;
Slovene mama ‘mom’; Czech mama ‘mom’; Slovak mama ‘mom’; Polish mama
‘mom’; Upper Sorbian mama ‘mom’; Lower Sorbian mama ‘mom’.

Baltic : Proto-Baltic *mama ‘mom’: Lithuanian mama, (dial.) mdmd ‘mom’; Latvian mdma
‘mom’.

Germanic : Standard German Mama ‘mom’, Oma ‘granny’; Alemannic Mamme ‘mom’; Alsatian
Mamma ‘mom’; Low German Mame ~ Mamme ~ Mamma ‘mom’; Dutch ma ~
mam ~ mama ‘mom’, oma ‘granny’; Danish mama ‘mom’; Swedish mamma ‘mom’;
Norwegian mamma ‘mom’; Faeroese mamma ‘mom’; Icelandic mamma ‘mom’.

Italic : Latin Maia ‘Great Goddess = Earth, associated with the cult of Vulcan, and mother
of Mercury’, Maius ‘month of May’, mamma ‘mommy, mother, wet nurse’; Romanian
mdmd ‘mother, mom’; Italian mamma ‘mom’; Sursilvan mumma ‘mom’;
Sutsilvan moma ‘mom’; Surmiran mamma ‘mom’; Puter mamma ‘mom’; Vallader
mamma ‘mom’; Friulian mame ‘mom’; French maman ‘mom’, meme ~ mamie
‘granny’; Occitan mama ‘mom’, mameta ‘granny’; Catalan mama ‘mom’; Spanish
mama ‘mom’; Portuguese mamde ‘mom’.

Celtic : Proto-Celtic *mammd: Old Irish mam ‘mother’; Welsh mam ‘mother’; Breton mam
‘mother’; Cornish mam ‘mother’; Proto-Celtic *mammid: Old Irish muimme ‘foster
mother’.

Albanian : Tosk meme ‘mother’; Gheg mams ‘mother’.


Appendix B. The Proto-Indo-European root *pa ~ *papa ‘father, dad’ 

Anatolian : Palaic pdpa ‘father’. 

Indic : 1. Proto-Indic *bappa: Prakrit bappa ‘father’; Armenia Gypsy bap ‘father’; Dameli
bap ‘father, grandfather’; Gawar-Bati bdp ‘father’; Torwali bdp ‘father’; Lahnda
bdpu ‘grandfather’; Punjabi bdp, bdpu ‘father’; Nepali bdp ‘father’; Assamese bdp
‘father’, bdpd ‘term of address to a father or of affection to a young man’, bdpu ‘term
of address to a learned Brahman’; Bengali bdp ‘father’, bdpu ‘father, child’; Oriya
bdpa ‘father’, bapd ‘term of endearment to younger persons’, bdpu ‘term of address
to a father or to a young person’, (Puri dial.) bdpd ‘father’s father’; Maithili bdp,
bappd ‘father’; Awadi (Lakhimpuri dial.) bdp ‘father’; Hindi bdp ‘father’; Gujarati
bdp ‘father’; Marathi bdp ‘father’; Sinhalese bapa ‘father’; West Pahari (Koci
dial.) bdp ‘father’, (Kiuthali) bapu (used by Rajputs), bdpu; Maldivian (upper class)
bappa, (lower class) bafd ‘father’. 2. Proto-Indic *babba: Domaki baba ‘father,
father’s brother’ (pl. pidra < pitf); Pashai (Areti dial.) bdba ‘father’; Shumashti
bdbd; Bashkarik bab ‘father’, bobd ‘father’s brother’; Savi bdb, bdbu ‘father’; Phalula
bdbu ‘father’, bdba ‘father’s brother’; Shina (Gilgiti dial.) bdbu ‘father’, (Palesi)
bubd; Kashmiri bab ‘father, grandfather’, bdb ‘father’, (Rambani dial.) babb ‘father’,
(Poguli) baub ‘father’, (Dodi) babbo ‘father’; Punjabi bdbbd ‘father, grandfather’,
bdbu ‘term of respect’, (Kangra dial.) babb ‘father’; West Pahari (Bhadrawahi dial.)
bdbd ‘father’, (Bhalesi) bdb ‘father’, (Curahi) bdbb ‘father’, (Cameali) babb ‘father’,
(Khashali) babb ‘father’ (voc. bdvd); Kumauni bdbu ‘father’, babd ‘affectionate term
for father or child’; Nepali bdbu ‘father’, bdbai ‘term of address to child’, babuwd
‘father, (Tarai dial.) affectionate term for son’; Bengali bdbd ‘father, baby’, bdbu
‘gentleman’; Oriya bdbd ‘father’, babd ‘father’s elder brother’, bdbu ‘gentleman’,
babud ‘term of endearment to juniors’; Maithili bdbd ‘father’, bdbu ‘title of respect’;
Hindi bdbu ‘father’, babuwd ‘child’; Gujarati bdbu ‘term of respect’; Marathi bdbu
‘term of respect’; Marathi bdbdd ‘term of endearment to a child’; West Pahari (Koci
dial.) bdb ‘father’, (Kiuthali) babu ‘father’ (used by Rajputs), bdbu ‘father’. (Turner
1962-1966: etym. 9209)

Nuristani : Kata-vari (Ktivi dial.) vov ‘grandfather’; Kamv’iri vov ‘grandfather’; Supu-vari vd 
‘grandfather’; Sanu-viri bdba ‘elder brother’; Usiit-vare vdv ‘grandfather’, bab ‘elder 
brother’; Va-ala bdba ‘elder brother’; Ames-ala bdba ‘elder brother’; Nisei-ala bdba 
‘elder brother’. (Strand 1997-2008) 

Iranian : Khwarezmian papa, bdb ‘father’ (Rybatzki 2006); Sogdian bdbay ‘father’ (Rybatzki
2006); Yaghnobi bobo ‘grandfather’ (Bird 2006); Bactrian babu ‘masc. personal
name’ (Rybatzki 2006); Pashto bdbu ‘dad, address term to an elder’, bdbd ‘grandfather’
(Kabir & Akbar 1999; Schurmann 1962); Wakhi pup ‘grandfather’ (Grierson
1920); Sanglechi bobo ‘father’s father’ (Rybatzki 2006); Ishkashmi bdbd
‘grandfather’ (Grierson 1920); Shughni bub ‘grandfather’ (Skold 1936); Bajui bdb
‘grandfather’ (Skold 1936); Sahdara bdb ‘grandfather’ (Skold 1936); Bartangi bdb
‘grandfather’ (Skold 1936); Yazghulami bdb ‘grandfather’ (Skold 1936); Parachi bdw
‘father’, bdbd ‘grandfather’ (Rybatzki 2006); Pahlavi bdbd ‘first part of masc. name’
(Rybatzki 2006); Farsi bdbd ‘father, grandfather’ (Rybatzki 2006); Basseri Farsi ba°
‘father’, bdbd ‘grandfather’ (Rybatzki 2006); Dari bdbd ‘grandfather, father, dad’
(Rybatzki 2006); Tajik baba ~ bawa ~ baab ‘father’, bdbd ‘ancestor’ (Schurmann
1962); Baluchi bdbd ‘elder man’ (Rybatzki 2006); Marri Baluch baba ‘father, grandfather,
grandfather’s brother’ (Pehrson 1966); Hazara bdbd ‘father’ (Schurmann
1962); Kurdish bav ‘father’, bavo ‘dad’, bapir ‘grandfather’ (Rybatzki 2006); Zaza
bao ‘dad (vocat.)’ (Keskin no date).

Armenian : pap ‘grandad’. 

Hellenic : Classical Greek pappa ‘dad’, pappos ‘grandfather, forebear, ancestor’; Modern
Pontic Greek papa ‘dad’ (Fauvin & Nikaki, personal communication); Standard
Modern Greek baba ‘dad’ (borrowed from Turkish, see Chantraine 1968), pappous
‘grandfather’.

Baltic : Latvian paps ‘dad’.

Germanic : Gothic papa ‘dad’; Modern High German Papa ‘dad’, Opa ‘grandad’; Alsatian Papa
‘dad’; Alemannic Pappe ‘dad’; Rhine Franconian Pappe ~ Babbe ‘dad’; Bavarian
Babba ‘dad’; Dutch pa ~ papa ~ pappa ‘dad’, opa ‘grandad’; English papa; Danish
papa ‘dad’; Swedish pappa ‘dad’; Norwegian pappa ‘dad’; Faeroese pdpi ‘dad’; Icelandic
pabbi ‘dad’.

Italic : Latin pappa ‘dad’, pappus ‘grandfather, ancestor’; French papa ‘dad’, pepe ~ papy
‘grandad’; Sursilvan bab ‘father’; Sutsilvan bab ‘father’; Surmiran bab ‘father’;
Puter bap ‘father’; Vallader bap ‘father’; Friulian pai ‘dad’; Italian papa ~ babbo
‘dad’; Occitan papa ‘dad’, papet ‘grandad’; Catalan papa ‘dad’; Spanish papa ‘dad’;
Portuguese pai ~ papa ~ papai ‘dad’.

Albanian : baba ‘dad’ (borrowed from Turkish, Meyer 1891). 

Appendix C. The Proto-Indo-European root *tat- ~ *tet- ‘father’ 

[or, rather, *tata ‘dad, father’] 

Anatolian : Hieroglyphic Luwian tati(a)- ‘father’; Luwian tati(ja)- ‘father’; Lycian tedi ‘father’. 

Indic : Sanskrit tdtd ‘(vocative) affectionate address to junior’ (Satapatha Brahmana), ‘idem
to senior’ (Mahabharata), ‘father’ (ibid.), tatd ‘father’ (Rig Veda); Pali tdta ‘term of
respectful or affectionate address to an elder or younger’; Prakrit tda ‘father, son’;
Germany Gypsy tatta ‘father’; Pasai (Darrai-i Nur and Wegali dial.) tati ‘father’;
Khowar tat; Old Gujarati tdya m. (Turner 1962-1966: etym. 5754). Proto-Indic
*dddda ‘father or other elderly relative’: Germany Gypsy dad ‘father’; Domaki dado
‘grandfather’; Dameli dddi ‘father’; Pasai (Laurowani dial.) dadd ‘elder brother’,
(Gulbahari) dadd ‘father’, (Kurangali) dado ‘father’s brother’; Kalasha dada ‘father’;
Bashkarik dad ‘grandfather’, ded ‘grandmother’; Phalula dodo ‘father’s father’, dedi
‘father’s mother’; Shina dado ‘grandfather’, dddi ‘grandmother’; Sindhi dado ‘father’s
father’, dddi ‘father’s mother’, (Kacchi dial.) dado ‘grandfather’; Lahnda dadd ‘father’s
father’, di ‘father’s mother’, dadd m., di f.; Punjabi daddd, da m., daddi, di f.; Western
Pahari (Bhalesi dial.) dado m., (Kotgarhi) dad ‘father’s father, elder brother’,
daddi ‘father’s mother’, (Kiuthali) dadd ‘grandfather’; Kumauni dadd ‘grandfather,
elder brother’, dddi ‘grandmother, elder sister’, da ‘address to an elder brother’;
Nepali dadd ‘old servant’, ddjyu, ddi (contaminated by bhdi < bhrdtf [Proto-Indic
form of PIE *bhrat3r ‘brother’, PJB & AME]) ‘elder brother’; Assamese dadd ‘elder
brother’; Bengali dadd ‘grandfather, elder brother’, dddi ‘grandmother’; Oriya dadd
‘grandfather, father’s brother, elder brother’; Maithili dadd ‘grandfather’; Hindi
dadd ‘father’s father, elder brother’, dddi ‘father’s mother’; Gujarati dado ‘father’s
father’, dddi f.; Marathi dadd ‘elder brother’, dddi ‘respectful term for an old woman’.
(Turner 1962-1966: etym. 6261)

Nuristani : Kata-vari (Ktivi dial.) to ‘father’, -to ‘father’s (brother)’; Kamv’iri tot ‘father’, -tot 
‘father’s (brother)’; Va-ala tdta ‘father’, -ta ‘father’s (brother)’, el-ta ‘grandfather’ 
(cf ei ‘mother’, el-ei ‘grandmother’); Ames-ala tdta ‘father’, -tdta ‘father’s (brother)’, 
garpta ‘grandfather’ (cf garpei ‘grandmother’); Nisei-ala tati ‘father’, -tati ‘father’s 
(brother)’ (Strand 1997-2008). 

Iranian : Old Avestan td ‘father’ (Yasna 47.3); Yaghnobi dodo ‘father’ (Bird 2006); Shughni
tat ‘father’ (Mancino no date); Roshani taat ‘father’, tatek ‘grandfather’; Ishkashmi
tot ~ tat ‘father’ (Grierson 1920); Wakhi tat ‘father’ (Grierson 1920); Zebaki
tat ~ td ‘father’ (Grierson 1920); Pashto dadd ‘a term of endearment to a father
or elder brother (East), also elder sister (West)’ (Raverty 1867); Zaza ded ‘father’s
brother’, dedo ‘idem (voc.)’ (Werner 2009); Talysh dada ‘father’ (Schulze 2000);
Baluchi dada ‘father’s father’ (Mumtaz 1985).

Greek : Classical Greek (Myrin.) tatd (voc.) ‘daddy’, Homeric tetta. 

Slavic : Proto-Slavic *tata ‘father, dad’: Pskov, Arkhangelsk, Eastern and Southern dialects
of Russian tata ‘father, dad’; Bulgarian tato ‘father, dad’; Serbo-Croatian tdta ‘father,
dad’; Slovenian tdta ‘father, dad’; Czech tdta ‘father, dad’; Polish tdta ‘father, dad’;
Lower Sorbian tdta ‘father, dad’; Upper Sorbian tdta ‘father, dad’.

Baltic : Proto-Baltic *tet-ia-, *tet-id-: Lithuanian teti-s ‘father’, tete ‘dad’; East Lithuanian
tete ‘father’; Samogitian titi-s, dial. tditi-s ‘father’; Latvian tete, tetis ‘dad’.

Italic : Latin tdta ‘dad’; Old Castilian taita ‘dad’ (Nebrija 1492), Old Catalan taita ‘dad’;
Catalan (dialectal) tata ‘dad, brother’; Neapolitan tdta ‘dad’; Romanian tatd
‘father, dad’; Sursilvan tat ‘grandfather’; Sutsilvan tat ‘grandfather’; Surmiran tat
‘grandfather’.

Celtic : Old Cornish tat ‘father’ (Vocabularium Cornicum c. 1250); Cornish tat ‘father’;
Middle Welsh tad ‘father’ (Charles-Edwards 2003); Welsh tad ‘father’, dada ‘dad’;
Middle Breton tat ‘father’ (Izard 1965); Breton tad ‘father’, tata ‘dad’; Old Irish data
‘foster father’ (Charles-Edwards 2003).

Albanian : tate ‘father’. 

Appendix D. The Proto-Dravidian root *appa ‘dad, father’ 

Tamil appan, appu ‘father (term of endearment used to little children or inferiors)’, appacci 
‘father’, appdttai ‘elder sister’, appi ‘mistress of house, elder sister’; Malayalam appan 
‘father’, appu ‘affectionate appellation of boys’; Kannada appa ‘father (frequently added 
to the proper names of men as a term of common respect; used endearingly to children 
by elders)’, apa ‘father’, appu ‘affectionate appellation of boys’; Kodagu appe ‘father’; Tulu 
appa, appe ‘affix of respect added to proper names of men’, appe ‘mother’, appa ‘a mode
of calling a mother’; Telugu appa ‘father, mother, elder sister (frequently added to names
of men as a term of common respect)’; Kolami appa ‘father’s sister’; Gondi dpordl ‘father’,
maipo ‘my father’, mt-dpd ‘thy father’; Maria tape ‘father’; Konda tappe, (L.) tdpe ‘father’ 
(Voc. 1656); Koya Su. tappe ‘(his, her) father’; Konda aposi ‘father (with reference to third 
person)’. (Burrow & Emeneau 1984, etym. 156) 


Appendix E. The Proto-Turkic roots *ata ‘dad, father’, *apa ‘dad, father’, 
and *ana ‘mom, mother’ 

1. Proto-Turkic *ata ‘father’ : Old Uighur ata ‘father’; Sary-Yughur ata ‘father’; Nogai ata
‘father’; Oirat ada ‘father, ancestor’; Karakhanid ata ‘father’; Turkmen ata ‘father’s father’; 
Azeri ata ‘father’; Balkar ata ‘father’; Tuvin ada ‘father’; Middle Turkish ata ‘father’; Tatar 
ata ‘father’; Kumyk ata ‘father’; Tofalar ada ‘father’; Uzbek ota ‘father’; Kirghiz ata ‘father, 
ancestor’; Karakalpak ata ‘ancestor’; Modern Turkish ata ‘ancestor’; Bashkir ata ‘father’; 
Uighur ata ‘father, ancestor’; Urum ata ‘father’; Cuman ata ~ atta ‘father’; Kazakh ata 
‘father’; Khakassian ada ‘father’; Karaim ata ‘ancestor’. (Dybo 2006)

2. Proto-Turkic *apa ‘father’ : Orkhon apa ‘ancestors’; Old Uighur apa ‘ancestors’; Salar aha ~
apa ‘father’; Bashkir (dial.) apa ‘father’; Sary-Yughur awa ‘father’; Khakassian aha ‘father’; 
Karakhanid apa ‘father, ancestor’; Tatar (dial.) aha ‘father’; Tuvin ava ‘father’; Turkish aha 
‘father’; Kirghiz aha ‘father’; Altai aha ‘father, bear’; Azeri (dial.) aha ‘father’; Balkar appa 
~ aha ‘father’; Chuvash oha ‘bear’; Turkmen (dial.) aha ‘father’. (Dybo 2006) 

3. Proto-Turkic *ana ‘mother, mom’ : Old Uighur ana ‘mother’; Karakhanid ana ‘mother’; 
Azeri ana ‘mother’; Dolgan ihe ‘mother’; Gagauz ana ‘mother’; Turkmen ana ‘mother’; 
Tuvin ije ‘mother’; Karaim ana ‘mother’; Middle Turkish ana ‘mother’; Khakassian ina 
‘mother’; Kirghiz ene ‘mother’; Karakalpak ana ‘mother’; Oirat ene ‘mother’; Kazakh ana 
‘mother’; Salar ana ‘mother’; Uighur ana ‘mother’; Chuvash ahne ‘mother’; Bashkir ina 
‘mother’; Sary-Yughur ana ‘mother’; Kumyk ana ‘mother’; Yakut ije ‘mother’; Balkar ana 
‘mother’. (Dybo 2006) 


Appendix F. The origin of words for ‘dad’, ‘father’, ‘mom’, and ‘mother’ 
in the Chinese family. 

The two databases (Starostin 2006; Wang 2004) from which the following data have 
been drawn differ in their respective transcriptions of oral stops; we have aligned them 
according to Wang’s transcription. Superscript numbers following modern dialectal 
forms in Wang’s data transcribe tonal contours. 

1 . ‘Dad’ : Preclassic pa? ‘father’; Classic pd; Western Han pd; Eastern Han pwd; Early Post¬ 
classic ptvd; Middle Postclassic pwd; Late Postclassic pwd; Middle Chinese pwa (Starostin 
2006). Modern forms: Beijing pa^; Jinan pa^; Xi’an po^; Taiyuan pa^; Hankoupa^^; Chengdu 
pa^^; Yangzhoupai Suzhou pa'h Wenzhou pa"; Changsha pa'*; Shuangfeng po^^; Nan- 
changpu/c"**; Meixianpa"; Guangzhou pa**; Xiamen pa^^; Chaozhou pa**; Fuzhou pa**; 
Shanghai pa* (Wang 2004). 

2. ‘Father’ : Preclassic pah?-. Classic pd; Western Han pd; Eastern Han pwd; Early Postclassic 
pw6-. Middle Postclassic pwd; Late Postclassic ptvd; Middle Chinese pii (Starostin 2006). 
Modern forms: Beijing fu^-, Jinan fu^-, Xi’an fu^-, Taiyuan fu^-, Hankou fu^-, Chengdu fu^-, 
Yangzhou/«^; Suzhou/«^^; Wenzhou voy^f Changsha/«^*; Shuangfeng Nanchang 

Meixian/u^; Guangzhou/«^^; Xiamen (lit.),pe^^; Chaozhou pe^^; Fuzhou 

Shanghai Zhongyuan yinyun/«^ (Wang 2004). 

3. ‘Mom’ : Preclassic mf'd?-. Classic m*’d; Western Han mf'd-. Eastern Han mf'd-. Early Postclassic 
m*'d; Middle Postclassic mf'6-. Late Postclassic m*’d; Middle Chinese mo (Starostin 2006). 
Modern forms: Beijing ma**; Jinan ma**; Xi’an ma**; Taiyuan ma*; Hankou ma**; Chengdu 
ma**; Yangzhou ma**; Suzhou ma**; Wenzhou ma^f Changsha ma**; Shuangfeng mo**; 
Nanchang ma**; Meixian: ma**; Guangzhou ma**; Xiamen ma**; Chaozhou ma**; Fuzhou 
ma**; Shanghai ma*; Zhongyuan yinyun ma^ (Wang 2004). 

4. ‘Mother’ : Preclassic ma?; Classic ma. Western Han ma; Eastern Han ma; Early Postclassic 
md; Middle Postclassic maw; Late Postclassic maw; Middle Chinese mAw ‘mother’ (Staros¬ 
tin 2006). Modern forms: Beijing m«^; Jinan muf Xi’an m«^; Taiyuan m«^; Hankou muf 
Chengdu muf Yangzhou mof Suzhou mo**; Wenzhou mo^f Changsha mof Shuangfeng 
muf Nanchang muf Meixian mu**; Guangzhou mou^f Xiamen hu^ (lit.), hof Chaozhou
bo^*; Fuzhou muf Shanghai mu^; Zhongyuan yinyun mu^ (Wang 2004).

Appendix G. The descent of Proto-Indo-European *deik’e- ‘to show,
to point’ and *dekm- ‘ten’. 


1. Proto-Indo-European *deik’e- ‘to show, to point’:

Anatolian : Hittite: tekk-ussai- ‘to show’. 

Indic : Proto-Indic *dis ‘to point’:

(a) Sanskrit (Rig Veda) dis-dti ‘points out’, dts ‘direction, region’, (Mahabharata)
dts-d ‘direction’; Pali dis-dti ‘points out’, disd ‘id.’, (Vajasaneyi-Samhita) desd ‘point,
region, part’; (Ramayana) desd ‘province, country’; Prakrit dis-fli' ‘tells’, dis-d ‘direction’,
desa ‘part, country’; Old Gujarati dis-i ‘direction’; Old Awadhi dih ‘direction’,
desa ‘country’; Armenia Romany les ‘earth, world, life’; Palestine Romany des ‘place,
camping ground’; Kalasha (Rumbur dial.) des ‘country’, desa ‘far, distant’; Phalula
des ~ dts ‘village’; Pashai des ‘cultivated field’; Torwali dis-d ‘towards’; Kashmiri
(Kashtawari dial.) dis ‘country’; Shina dts ‘place’; Sindhi dehu ‘country’; Western
Pahari (Bhadrawahi dial.) des ‘village’; Kumauni des ‘country’; Nepali des ‘country,
plains of India’; Bengali des ‘country’; Oriya desa ‘country’; Maithili des ‘country’;
Assamese dih ‘means, direction’; Hindi dis-na ‘to show, to exhibit’, dis ‘direction,
side’, des ‘country’; Marwari des-ro ‘small country’; Gujarati des ‘country’; Marathi
des ‘country’; Sinhalese das-aya ‘direction’, desa ‘country’. (Turner 1962-1966:
etym. 6339, 6340, 6547)

(b) Sanskrit (Kausikasutra) dis-ti ‘a measure of length’; Shina (Gilgiti dial.) di-t
‘span’, (Jijelut dial.) dis ‘span’; Dameli dis-t ‘span’; Khowar dis-t ‘handspan’; Kalasha
jis-t ‘handspan’; Phalula dis-t ‘span’. (Turner 1962-1966: etym. 6343)

Nuristani : Ashkun desi’ ‘village’; Kalasha-ala (Waigali) des ‘village’. (Turner 1962-1966: etym. 
6340) 


Iranian : (a) Old Avestan d-dis-ti- ‘direction’; Avestan dais ‘to show’, dax-Ua- ‘sign, revelation’;
Khotanese dis- ‘to confess’; Sogdian p-d’ys ‘to show’; Parthian ‘dys-g ‘sign’,
‘b-dys- ‘to show’; Ossetic dis ~ des ‘amazement’; cev-dis-yn ~ cev-des-un ‘to show’.
(Lubotsky no date).

(b) Avestan dis-ti ‘a measure of length’; Khotanese di-thi ‘a measure of length’.

Greek : (a) Class. Greek detk-numi ‘to show’; Cretan dik-nuti ‘to show’; Mod. Greek deix-no
‘to show’.

(b) Classical Greek (?) dak-tulos ‘finger’; Modern Greek (?) dak-tilo ‘finger’.

Germanic : Proto-Germanic (a) *ga-tihan ‘to announce, tell’, *taik-n ‘token’, *taik-njan ‘to
show, to manifest’; (b) *taih-wd ‘toe’:

(a) Old Norse tjd ‘to show’; Old High German zeig-dn ‘to show’, zih-an ‘to accuse’,
zeihh-an ‘sign’, ziht ‘accusation’; Old Franconian teik-in ‘sign’; Old Frisian tig-ia
‘to show’, tek-an ‘sign’; Old English te-on ‘to show’, tac-an ‘to teach’, tdc-en ‘sign’;
Middle High German zeig-en ‘to show’, zih-en ‘to accuse’, zeich-en ‘sign, example’;
Middle Low German tie-n (participle tig-en) ‘to show’, tek-en ‘sign’; Middle Dutch
tie-n ‘to show’, tek-en ~ teik-en ‘sign’, tiht ‘accusation’; Icelandic tig-n ~ teik-na ‘give
a sign’; Faeroese tek-na ‘to show’, tek-n ‘sign’; Norwegian te ‘to show’, teik-n ‘sign’;
Swedish te ‘to show’, teck-en ‘sign’; Danish te ‘to show’, teg-n ‘sign’; English teach,
tok-en; Dutch aan-tijg-en ~ op-tijg-en ~ be-tijg-en ‘to show’, tek-en ‘sign’, dial.
teiken ‘sign’; German zeig-en ‘to show’, Zeich-en ‘sign’; Alsatian zaig-a ‘to show’,
Zaich-a ‘sign’.

(b) Old Norse td ‘toe’; Old High German zeh-a ‘toe’; Old English tdh-e ‘toe’; Middle
Low German tewe ‘toe’; Icelandic td ‘toe’; Faeroese td ‘toe’; Norwegian td ‘toe’;
Swedish td ‘toe’; Danish td ‘toe’; English toe; Dutch teen ‘toe’; German Zehe ‘toe’;
Alsatian Zecha ‘toe’.

Baltic : Proto-Baltic *teig- ‘to tell’; Old Lithuanian tieg ‘he said’; Lithuanian teig-ti ‘to say,
tell, claim’.

Italic : (a) Oscan detk-um ‘to say’; Umbrian tik-amne ‘dedication’; Latin dic-ere ‘to say’, dic-tid
‘discourse’, ju-dex ‘judge’ (telling ju-s, the law); French di-re ‘to say’, in-diqu-er
‘to show’; Occitan dts-er ~ digu-er ‘to say’; Catalan di-r ‘to say’; Aragonese dt ‘to
say’; Spanish dec-ir ‘to say’; Portuguese diz-er ‘to say’; Sursilvan di-r ‘to say’; Sutsilvan
gi-r [dzir] ‘to say’; Surmiran dei-r ‘to say’; Puter di-r ‘to say’; Vallader di-r ‘to
say’; Friulian dt ‘to say’; Italian di-re ‘to say’; Romanian zic-e ‘to say’.

(b) Latin in-dex ‘indicative, index finger’, dig-itus ‘finger’; French doig-t ‘finger’;
Occitan de-t ‘finger’, en-dei-s ‘index finger’; Catalan di-t ‘finger’; Spanish de-do ‘finger’;
Portuguese de-do ‘finger’; Sursilvan de-t ‘finger’; Sutsilvan de-t ‘finger’; Surmiran
de-t ‘finger’; Puter dau-nt ‘finger’; Vallader dai-nt ‘finger’; Friulian de-t ‘finger’;
Italian di-to ‘finger’; Romanian deg-et ‘finger’.

2. Proto-Indo-European *dekm- ‘ten’ (all reflexes below also mean ‘ten’): 

Indic : Proto-Indic *dasan; Vedic ddfa; Prakrit dasa ~ daha; Pali dasa; Asokan dasa ~
dasa; Apabhramsa dasa ~ daha; European Romany des; Armenia Romany las;
Palestine Romany das; Gondwani dhantak; Dameli das; Domaki dai; Tirahi dd;
Poguli ddh; Rambani das; Kohistani das; Pashai ddya; Shumashti das; Ningalami
das; Wotapuri das; Gawarbati dos; Kalasha das; Khowar /os; Bashkarik das; Torwali
das; Kandia das; Maiya das; Savi das; Phalula das; Shina dai; Kashmiri dah; Rambani
das; Poguli ddh; Dodi dds; Sindhi daha; Khatri do; Kacchi dau; Lahnda ddh;
Khetrani dd; Awankari dd; Punjabi das; Siraiki dah; Western Pahari das; Kotgarhi
lbs; Garhwali das; Kumauni das; Nepali das; Assamese dah; Mayang dos; Bengali
das; Oriya dasa; Bihari das; Maithili das ~ dah; Magahi das; Bhojpuri das; Awadi
das; Lakhimpuri das; Hindi das; Bhili ddh ~ dah; Dogri das; Chattisgarhi das;
Khandeshi das; Braj das; Bundeli das; Urdu das; Rajasthani das; Malvi das; Magaji
das; Marwari das; Gujarati das; Marathi das ~ daha; Konkani dhd; Sinhalese dasa-ya
~ daha-ya; Maldivian diha. (Turner 1962-1966: etym. 6227; Rosenfelder no
date)

Nuristani : Kalasha-ala (Waigali) dds; Wasi-weri lez; Kati due; Kamviri d’uf; Ashkun dus.
(Turner 1962-1966: etym. 6227)

Iranian : Avestan dasa; Pahlavi dah; Khotanese dasau; Khwarezmian dhs; Turfanian dh;
Iron Ossetic does; Digor Ossetic dees; Yaghnobi das; Pashto las; Wakhi das; Munji
dah; Ishkashmi da; Sanglechi das; Zebaki dos; Shughni bis; Yidgha los; Rushani
des; Yazgulami dus; Sarikoli des; Parachi dds; Ormuri das; Nayini de; Natanzi d’e;
Khunsari de'; Gazi de; Sivandi da; Vafsi dah; Semnani das; Gilaki da; Mazanderani
da; Talysh da; Harzani doh; Zaza des; Gorani da; Balochi dah; Southern Kurdish
da; Northern Kurdish da; Persian dah; Tajik dah; Tati doeh; Chali dd; Farsi dasa;
Lari da; Luri dah; Kumzari da’hata.

Tocharian : Tocharian A sdh, Tocharian B sak. 

Armenian : Classical Armenian t’asn; Western Armenian tas.

Hellenic : Classical Attic Greek deka; Aeolic deko; Modern Greek deka; Tsakonian deka;
Cypriot dega; Pontic deka ~ reka.

Slavic : Proto-Slavic *des-^t’; Old Church Slavonic des-fti; Russian des-jat’; Belorussian
dzes-jac; Ukrainian des-jat’; Polish dzies-i^c; Kashubian dzes-inc; Polabian dis-qt;
Czech des-et; Slovak des-at’; Eastern Slovak dzes-ec; Upper Sorbian dzes-ac; Lower
Sorbian zas-es; Bulgarian des-et; Serbo-Croatian des-et; Slovene des-et; Macedonian
des-et.

Baltic : Proto-Baltic *decim-t-; Old Prussian dessim-pts; Lithuanian desim-tis; Latvian
desm-its.

Germanic : Proto-Germanic *ttxun; Gothic tathun; Old Norse tiu; Old Icelandic tio; Old Swedish
tio; Old Danish ti; Old High German zehan; Old Saxon tehan; Old Frisian
tian; Old Low Franconian ten; Old English tine; Middle Low German tein; Middle
Dutch thien; Middle High German zehen; Icelandic tio; Norwegian tio ~ tie;
Swedish tio; Dalecarlian tin; Faeroese ttggju; West Frisian tsien; Saterland Frisian
tjoon; Föhr North Frisian tjiin; Sylt North Frisian tiin; Helgoland North Frisian
tain; Dutch tien; Low Saxon tain; Westphalian Saxon tein; Crimean Gothic thiine;
English ten; German zehn; Bavarian zene; Swabian zaen; Cimbrian zegan; Rhine
Franconian zeen; Luxemburgish zeng; Swiss German zdh.

Italic : Latin decem; Old French dis; French dix; Walloon dijh; Jerriais dgix; Picard dich;
Poitevin dis; Occitan detz; North Occitan die; Franco-Provençal dyi; Aragonese
deu, dech-igueit ‘eighteen’; Catalan deu; Spanish diez; Ladino dies; Asturian diez;
Galician dez; Portuguese dez; Sursilvan diesch; Sutsilvan diesch; Vallader desch;
Friulian dis; Ladin diesc; Piedmontese des; Milanese des; Genovese dexe; Venetian
diese; Corsican dece; Umbrian desce; Neapolitan riece; Sicilian decis; Italian dieci;
Sardinian deghe; Romanian zece; Arumanian date; Meglenian zeti.

Celtic : Gaulish decam; Old Irish deich; Irish deich; Scottish Gaelic deich; Manx Jeih; Welsh
deg; Breton dek; Vannetais dek; Cornish dek.

Albanian : Standard Albanian dhjete; Gheg det; Tosk zjete.